
OpenAI has published new research explaining why ChatGPT, its widely used language model, sometimes produces false but convincing information—a phenomenon known as “hallucination.”
According to the company, the root cause lies in the way these models are trained and evaluated, processes that reward guessing over admitting uncertainty.
Newsweek contacted OpenAI for more information outside normal working hours.
Why It Matters
Large language models such as ChatGPT are increasingly being used in education, health care, customer service and other fields where accuracy is critical. Hallucinated outputs—statements that are factually wrong but have the appearance of legitimacy—can undermine trust and cause real-world harm.
What To Know
Despite progress in developing more capable models, including GPT-5, hallucinations remain a persistent issue, especially when models are prompted to generate specific factual information.
The findings, based on research by OpenAI scientists including Adam Kalai and Santosh Vempala, suggest that structural changes to training incentives are needed to address the problem.
Hallucinations are “plausible but false statements generated by language models,” according to OpenAI’s internal definition.
One example cited in the research involved a chatbot fabricating multiple titles for a researcher’s dissertation, all of them incorrect. In another case, the model gave three different, equally inaccurate dates for the same person’s birthday.

This is because of how language models are trained. During pretraining, models learn to predict the next word in a sentence based on massive volumes of text, but they are never shown which statements are false. This statistical process, while effective at generating coherent language, struggles with low-frequency facts such as birth dates and publication titles.
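In concrete terms, pretraining optimizes a next-word prediction objective. The sketch below is a generic illustration written in PyTorch, not code from OpenAI's paper; the toy embedding-plus-linear model and all dimensions are invented for demonstration.

```python
# Illustrative sketch (not OpenAI's code): the next-word prediction objective
# used in pretraining. The model is rewarded for assigning high probability to
# whichever token actually came next in the text; it is never told whether the
# resulting statement is true or false.
import torch
import torch.nn.functional as F

vocab_size, embed_dim = 100, 32
embed = torch.nn.Embedding(vocab_size, embed_dim)   # toy stand-in for a real transformer
head = torch.nn.Linear(embed_dim, vocab_size)

tokens = torch.randint(0, vocab_size, (1, 16))      # one training text as token IDs
inputs, targets = tokens[:, :-1], tokens[:, 1:]     # each position predicts the next token

logits = head(embed(inputs))                        # shape: (1, 15, vocab_size)
loss = F.cross_entropy(logits.reshape(-1, vocab_size), targets.reshape(-1))
# The loss only measures how well the observed next word was predicted;
# truthfulness never enters the objective.
```

Nothing in this objective distinguishes true statements from false ones, which is why rarely seen facts such as birth dates and publication titles are easy to get wrong.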
When such models are tested for performance, accuracy is often the only metric considered. That creates incentives similar to multiple-choice tests: It’s statistically better to guess than to say, “I don’t know.” According to the researchers, “If the main scoreboards keep rewarding lucky guesses, models will keep learning to guess.”
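The incentive can be shown with simple arithmetic. The snippet below is an illustrative calculation, not taken from the paper; the 30 percent confidence figure and the one-point penalty for a wrong answer are assumptions chosen to make the comparison concrete.

```python
# Expected score per question under two grading schemes, for a model that is
# only 30 percent sure of its answer (illustrative numbers, not from the paper).
p_correct = 0.30

# Accuracy-only scoring: 1 point if right, 0 if wrong, 0 for "I don't know".
guess = p_correct * 1 + (1 - p_correct) * 0      # 0.30
abstain = 0.0                                    # admitting uncertainty never scores
print(guess > abstain)                           # True: guessing always wins

# A hypothetical scheme that penalizes confident errors with -1 point.
guess_penalized = p_correct * 1 + (1 - p_correct) * -1   # -0.40
print(guess_penalized > abstain)                 # False: abstaining now wins
```

Under accuracy-only grading, any nonzero chance of being right makes guessing the better strategy, which is exactly the incentive the researchers say current benchmarks create.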
To illustrate the problem, the team compared two models on a basic evaluation test. The newer GPT-5 variant had a 52 percent abstention rate and a 26 percent error rate, while an older model, OpenAI o4-mini, showed a 1 percent abstention rate but a 75 percent error rate.
What People Are Saying
OpenAI wrote in the research paper: “At OpenAI, we’re working hard to make AI systems more useful and reliable. Even as language models become more capable, one challenge remains stubbornly hard to fully solve: hallucinations. By this we mean instances where a model confidently generates an answer that isn’t true. …
“Hallucinations persist partly because current evaluation methods set the wrong incentives. While evaluations themselves do not directly cause hallucinations, most evaluations measure model performance in a way that encourages guessing rather than honesty about uncertainty.”
What Happens Next
OpenAI said it was working to redesign evaluation benchmarks so that they reward models for acknowledging uncertainty rather than penalize them for it.