![](https://crypto4nerd.com/wp-content/uploads/2023/09/17WXDHTMn4J88xcUUpVW3iQ-1024x683.png)
When a human acquires the knowledge that “Olaf Scholz was the ninth Chancellor of Germany,” they can effortlessly respond to the question, “Who was the ninth Chancellor of Germany?” This seemingly simple act of generalization is a fundamental aspect of human cognition, often taken for granted.
However, a new paper titled “The Reversal Curse: LLMs trained on ‘A is B’ fail to learn ‘B is A’,” authored by a collaborative research team from Vanderbilt University, the UK Frontier AI Taskforce, Apollo Research, New York University, the University of Sussex, and the University of Oxford, unveils a remarkable shortcoming in auto-regressive large language models (LLMs).
This intriguing phenomenon, dubbed the “Reversal Curse,” revolves around the limitations of language models when trained on sentences structured as “A is B.” Surprisingly, these models do not automatically generalize to the inverse formulation, “B is A.” The team’s discovery challenges the conventional wisdom about the capabilities of advanced language models.
To illustrate the Reversal Curse, consider a model trained on sentences structured as “A is B,” where A is a name and B is a description. Such a model fails to generalize to the reverse direction, “B is A.” Specifically, when the LLM is conditioned on the description, it assigns no higher likelihood to the corresponding name than to a random baseline.
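The likelihood comparison above can be sketched with a toy stand-in for an LLM. The snippet below is a minimal illustration, not the paper’s actual evaluation code: it trains an add-one-smoothed bigram model on a single “A is B” sentence (the “Uriah Hawthorne” example is from the paper) and then compares the score a reversed prompt gives to the true name versus an arbitrary name. With a real LLM one would compare tokenizer-level log-probabilities instead; the helper names here are hypothetical.

```python
import math

# Toy bigram "language model" (a hypothetical stand-in for an LLM):
# count adjacent-token pairs in the training sentences.
def train_counts(sentences):
    counts = {}
    for s in sentences:
        toks = s.split()
        for a, b in zip(toks, toks[1:]):
            counts.setdefault(a, {})
            counts[a][b] = counts[a].get(b, 0) + 1
    return counts

# Add-one-smoothed log-probability of `continuation` given `prompt`.
def log_likelihood(counts, prompt, continuation, vocab_size):
    toks = (prompt + " " + continuation).split()
    start = len(prompt.split())
    ll = 0.0
    for i in range(start, len(toks)):
        nxt = counts.get(toks[i - 1], {})
        total = sum(nxt.values())
        ll += math.log((nxt.get(toks[i], 0) + 1) / (total + vocab_size))
    return ll

# One "A is B" training fact, never stated in the "B is A" direction.
sentences = ["Uriah Hawthorne is the composer of Abyssal Melodies"]
counts = train_counts(sentences)
vocab_size = len({t for s in sentences for t in s.split()})

# Forward direction ("A is B"): the trained continuation beats an alternative.
fwd_true = log_likelihood(counts, "Uriah Hawthorne is",
                          "the composer of Abyssal Melodies", vocab_size)
fwd_alt = log_likelihood(counts, "Uriah Hawthorne is",
                         "the writer of Abyssal Songs", vocab_size)

# Reverse direction ("B is A"): the description never preceded the name in
# training, so the true name's first token scores no higher than a random one.
rev_true = log_likelihood(counts, "the composer of Abyssal Melodies is",
                          "Uriah", vocab_size)
rev_rand = log_likelihood(counts, "the composer of Abyssal Melodies is",
                          "Random", vocab_size)
```

Running this, `fwd_true` exceeds `fwd_alt`, while `rev_true` and `rev_rand` are identical: purely left-to-right co-occurrence statistics give the reversed query no advantage, which is the pattern the paper reports at LLM scale.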
The research team substantiates their findings through a series of fine-tuning experiments conducted on synthetic data. They fine-tune a base LLM…