Do Chatbots Hallucinate, Make Things Up, or Outright Lie?

Stephen DeAngelis

July 20, 2023

The news has recently been filled with stories about large language models (LLMs) such as ChatGPT and Bard, which are often referred to as generative artificial intelligence (AI) systems. You can test the models for yourself and quickly discover that not everything they generate is true. The generated misinformation is attributed to AI hallucination, as though the AI model had somehow consumed magic mushrooms. Tech writer Oluwademilade Afolabi explains, “Artificial intelligence hallucination occurs when an AI model generates outputs different from what is expected. … A ChatGPT hallucination would result in the bot giving you an incorrect fact with some assertion, such that you would naturally take such facts as truth. In simple terms, it is made-up statements by the artificially intelligent chatbot.”[1]

That explanation sounds a lot like former presidential advisor Kellyanne Conway’s famous term, “alternative facts.” An incorrect fact is not a fact. It’s a lie. I tend to agree with reporter Rachel Metz, who writes, “AI doesn’t hallucinate. It makes things up. … Somehow the idea that an artificial intelligence model can ‘hallucinate’ has become the default explanation anytime a chatbot messes up. … Granting a chatbot the ability to hallucinate — even if it’s just in our own minds — is problematic. It’s nonsense. People hallucinate. Maybe some animals do. Computers do not. They use math to make things up.”[2]

All Misinformation is a Problem for Society

We have come to expect humans, especially politicians, to lie. We don’t expect AI systems to lie; nevertheless, they do. As tech journalist Benj Edwards (@benjedwards) tweeted, “It’s possible that OpenAI invented history’s most convincing, knowledgeable, and dangerous liar — a superhuman fiction machine that could be used to influence masses or alter history.” Rob McDougall, President and CEO of Upstream Works Software, adds, “There are three kinds of data — data that is correct; data that is wrong, and data that looks correct but is wrong. The third is the worst kind and LLM hallucination is a great example of such.”[3] LLM lying can be a particularly dangerous phenomenon for businesses using chatbots for customer service. McDougall cites a LinkedIn study conducted by Oliver Day that found an LLM can provide incorrect information to customers as well as generate false hyperlinks. Day concludes, “Any new customer facing technology needs to be tested thoroughly before purchase.”

Judging by how many adherents conspiracy theory sites and commentators attract, few people seem to care about fact-checking stories that support their worldview. That means when an LLM provides misinformation to users, it is unlikely to be fact-checked, and the misinformation spreads. Journalist Cade Metz explains, “Bots like Bard and OpenAI’s ChatGPT deliver information with unnerving dexterity. But they also spout plausible falsehoods, or do things that are seriously creepy, such as insist they are in love with New York Times journalists. … Because of the surprising way they mix and match what they’ve learned to generate entirely new text, they often create convincing language that is flat-out wrong, or does not exist in their training data. A.I. researchers call this tendency to make stuff up a ‘hallucination,’ which can include irrelevant, nonsensical, or factually incorrect answers.”[4] And that can be dangerous.

The Way Ahead

According to Metz and his colleague Karen Weise, “Figuring out why chatbots make things up and how to solve the problem has become one of the most pressing issues facing researchers as the tech industry races toward the development of new A.I. systems.”[5] Tech writer Elena Alston suggests six ways users can help prevent AI hallucinations.[6] They are:

1. Limit the possible outcomes. Alston explains, “When you give it instructions, you should limit the possible outcomes by specifying the type of response you want. … Another, similar tactic: ask it to choose from a specific list of options for better responses. By trying to simplify its answers, you’re automatically limiting its potential for hallucinating.”

2. Pack in relevant data and sources unique to you. Alston writes, “You can’t expect humans to suggest a solution without first being given key information. … For example, say you’re looking for ways to help your customers overcome a specific challenge. If your prompt is vague, the AI may just impersonate a business that can help you. … Your company has the specific data and information surrounding the problem, so providing the AI with that data inside your prompt will allow the AI to give a more sophisticated answer (while avoiding hallucinations).”

3. Create a data template for the model to follow. The last thing you would expect an AI solution to get wrong is math. Alston, however, found that LLMs do occasionally get math wrong. She writes, “The best way to counteract bad math is by providing example data within a prompt to guide the behavior of an AI model, moving it away from inaccurate calculations. Instead of writing out a prompt in text format, you can generate a data table that serves as a reference for the model to follow. This can reduce the likelihood of hallucinations because it gives the AI a clear and specific way to perform calculations in a format that’s more digestible for it.”

4. Give the AI a specific role — and tell it not to lie. According to Alston, “Assigning a specific role to the AI is one of the most effective techniques to stop any hallucinations. For example, you can say in your prompt: ‘you are one of the best mathematicians in the world’ or ‘you are a brilliant historian,’ followed by your question. If you ask GPT-3.5 a question without shaping its role for the task, it will likely just hallucinate a response. … This doesn’t always work, but if that specific scenario fails, you can also tell the AI that if it doesn’t know the answer, then it should say so instead of trying to invent something. That also works quite well.”

5. Tell it what you want — and what you don’t want. Alston explains, “You can anticipate an AI’s response based on what you’re asking it — and preemptively avoid receiving information you don’t want. For example, you can let GPT know the kinds of responses you want to prevent, by stating simply what you’re after. … By preemptively asking it to exclude certain results, I get closer to the truth.”

6. Experiment with the temperature. Kevin Tupper, the Principal Solution Architect at Azure OpenAI, explains, “‘Temperature’ is a parameter used to control the level of creativity in AI-generated text. By adjusting the ‘temperature’, you can influence the AI model’s probability distribution, making the text more focused or diverse.”[7] As Alston notes, the higher the temperature, the more likely you are to find that an LLM is “overzealous with its storytelling.”
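To make the temperature idea in the sixth tip more concrete, here is a minimal sketch in Python. It assumes the common convention in which a model’s raw scores for candidate words are divided by the temperature before being converted into probabilities; neither Tupper nor Alston spells out a formula, so treat the function name, the scores, and the temperature values as illustrative assumptions rather than a description of how any particular chatbot works.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores into probabilities after scaling by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the largest value for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw scores for three candidate next words (not real model output).
logits = [2.0, 1.0, 0.1]

# Low temperature concentrates probability on the top-scoring word.
low = softmax_with_temperature(logits, 0.2)

# High temperature spreads probability across the less likely words.
high = softmax_with_temperature(logits, 2.0)

print("low temperature: ", [round(p, 3) for p in low])
print("high temperature:", [round(p, 3) for p in high])
```

Under these assumptions, the low setting puts nearly all of the probability on the top-scoring word, while the high setting gives the unlikely words a real chance of being picked. That is one way to picture why a higher temperature tends to produce more inventive, and less reliable, text.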

Despite your best efforts to prevent AI hallucination, Alston insists you must “verify, verify, verify.” She adds, “Whether you’re using AI to write code, problem solve, or carry out research, refining your prompts using the above techniques can help it do its job better — but you’ll still need to verify each and every one of its outputs.” Users, however, should not have to be the final arbiters of truth when using generative AI. Software developers also need to continue work on preventing LLMs from providing misinformation. Irina Raicu, director of the Internet Ethics Program at the Markkula Center for Applied Ethics at Santa Clara University in California, insists, “I think what the companies can do, and should have done from the outset … is to make clear to people that this is a problem. This shouldn’t have been something that users have to figure out on their own. They should be doing much more to educate the public about the implications of this.”[8]
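As a rough illustration of how several of these tips and the “verify” step might fit together, the Python sketch below builds a prompt that assigns a role, packs in your own context, limits the answer to a fixed list of options, and tells the model to admit when it doesn’t know. It then checks that a reply actually falls inside the allowed set. The function names, the prompt wording, and the sample policy text are all hypothetical, and no real chatbot API is called.

```python
def build_prompt(question, options, context, role):
    """Assemble a prompt that applies several of the tips at once."""
    lines = [
        f"You are {role}.",  # give the AI a specific role
        "Use only the information in the context below.",
        "If the context does not contain the answer, reply exactly: I don't know",
        "",
        "Context:",
        context,  # data and sources unique to you
        "",
        f"Question: {question}",
        "Answer with exactly one of: " + ", ".join(options),  # limit the outcomes
    ]
    return "\n".join(lines)


def verify_reply(reply, options):
    """Return the matching option, or None if the reply is outside the allowed set."""
    cleaned = reply.strip().rstrip(".").lower()
    allowed = {option.lower(): option for option in options}
    allowed["i don't know"] = "I don't know"
    return allowed.get(cleaned)


# Hypothetical customer-service example.
options = ["Refund", "Replace"]
prompt = build_prompt(
    question="Which remedy applies to a damaged item returned within 30 days?",
    options=options,
    context="Policy: damaged items returned within 30 days are refunded.",
    role="a careful customer-service agent",
)
print(prompt)
print(verify_reply(" Refund. ", options))     # reply is within the allowed set
print(verify_reply("Store credit", options))  # reply is rejected
```

A check like this only confirms that the reply stays within the options you offered. It says nothing about whether the chosen option is true, so it supplements human fact-checking rather than replacing it.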

As a result of the dangers posed by AI hallucination, large tech companies have told Congress the technology needs to be regulated. Unfortunately, technology always runs ahead of regulation, and getting control of AI misinformation is going to be a Sisyphean task.

Footnotes
[1] Oluwademilade Afolabi, “What Is AI Hallucination, and How Do You Spot It?” Make Use Of, 24 March 2023.
[2] Rachel Metz, “AI Doesn’t Hallucinate. It Makes Things Up,” Bloomberg, 3 April 2023.
[3] Rob McDougall, “Chatbot Hallucinations,” LinkedIn.
[4] Cade Metz, “What Makes A.I. Chatbots Go Wrong?” The New York Times, 29 March 2023.
[5] Karen Weise and Cade Metz, “When A.I. Chatbots Hallucinate,” The New York Times, 1 May 2023.
[6] Elena Alston, “What are AI hallucinations and how do you prevent them?” Zapier Blog, 5 April 2023.
[7] Kevin Tupper, “Creatively Deterministic: What are Temperature and Top_P in Generative AI?” LinkedIn, 22 April 2023.
[8] Staff, “Misinformation Machines? Tech Titans Grappling With how to Stop Chatbot ‘Hallucinations’,” Markkula Center for Applied Ethics, 22 April 2023.
