Scientists Develop New Algorithm to Spot AI ‘Hallucinations’

19.06.2024 18:00

Time.com

An enduring problem with today’s generative artificial intelligence (AI) tools, like ChatGPT, is that they often confidently assert false information. Computer scientists call this behavior “hallucination,” and it’s a key barrier to AI’s usefulness.

Hallucinations have led to some embarrassing public slip-ups. In February, Air Canada was forced by a tribunal to honor a discount that its customer-support chatbot had mistakenly offered to a passenger. In May, Google was forced to make changes to its new “AI overviews” search feature, after the bot told some users that it was safe to eat rocks. And last June, two lawyers were fined $5,000 by a U.S. judge after one of them admitted he had used ChatGPT to help write a court filing. He came clean because the chatbot had added fake citations to the submission, which pointed to cases that never existed.

But in good news for lazy lawyers, lumbering search giants, and errant airlines, at least some types of AI hallucinations could soon be a thing of the past. New research, published Wednesday in the peer-reviewed scientific journal Nature, describes a new method for detecting when an AI tool is likely to be hallucinating. The method distinguishes correct from incorrect AI-generated answers approximately 79% of the time, about 10 percentage points better than other leading methods. Although the method addresses only one of the several causes of AI hallucinations, and requires roughly 10 times more computing power than a standard chatbot conversation, the results could pave the way for more reliable AI systems in the near future.

“My hope is that this opens up ways for large language models to be deployed where they can’t currently be deployed – where a little bit more reliability than is currently available is needed,” says Sebastian Farquhar, an author of the study, who is a senior research fellow at Oxford University’s department of computer science, where the research was carried out, and is also a research scientist on Google DeepMind’s safety team. Of the lawyer who was fined for relying on a ChatGPT hallucination, Farquhar says: “This would have saved him.”

Hallucination has become a common term in the world of AI, but it is also a controversial one. For one, it implies that models have some kind of subjective experience of the world, which most computer scientists agree they do not. It also suggests that hallucinations are a solvable quirk, rather than a fundamental and perhaps ineradicable problem of large language models (different camps of AI researchers disagree on the answer to this question). Most of all, the term is imprecise, describing several different categories of error.

Farquhar’s team decided to focus on one specific category of hallucinations, which they call “confabulations.” That’s when an AI model spits out inconsistent wrong answers to a factual question, as opposed to the same consistent wrong answer, which is more likely to stem from problems with a model’s training data, a model lying in pursuit of a reward, or structural failures in a model’s logic or reasoning. It’s difficult to quantify what percentage of all AI hallucinations are confabulations, Farquhar says, but it’s likely to be large. “The fact that our method, which only detects confabulations, makes a big dent on overall correctness suggests that a large number of incorrect answers are coming from these confabulations,” he says.

The methodology

The method used in the study to detect whether a model is likely to be confabulating is relatively simple. First, the researchers ask a chatbot to spit out a handful of answers (usually between five and 10) to the same prompt. Then, they use a different language model to cluster those answers based on their meanings. For example, “Paris is the capital of France” and “France’s capital city is Paris” would be assigned to the same group because they mean the same thing, even though the wording of each sentence is different. “France’s capital city is Rome” would be assigned to a different group.
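The grouping step can be sketched in a few lines of Python. This is only an illustration, not the researchers' code: the article does not say how the second language model judges whether two answers mean the same thing, so the sketch takes that judgment as a pluggable function. The names `cluster_by_meaning` and `toy_same_meaning` are hypothetical, and the toy check (do both answers name the same city?) is a stand-in for a real model.

```python
from typing import Callable, List


def cluster_by_meaning(
    answers: List[str],
    same_meaning: Callable[[str, str], bool],
) -> List[List[str]]:
    """Greedily group answers by meaning.

    Each answer joins the first cluster whose first member it matches;
    if it matches none, it starts a new cluster.
    """
    clusters: List[List[str]] = []
    for answer in answers:
        for cluster in clusters:
            if same_meaning(cluster[0], answer):
                cluster.append(answer)
                break
        else:
            # No existing cluster matched, so this answer is a new meaning.
            clusters.append([answer])
    return clusters


def toy_same_meaning(a: str, b: str) -> bool:
    """Hypothetical stand-in for the second language model.

    Treats two answers as equivalent if they name the same city.
    A real system would ask a model to compare the meanings instead.
    """
    cities = ("Paris", "Rome")
    named_a = {city for city in cities if city in a}
    named_b = {city for city in cities if city in b}
    return named_a == named_b


answers = [
    "Paris is the capital of France",
    "France's capital city is Paris",
    "France's capital city is Rome",
]
clusters = cluster_by_meaning(answers, toy_same_meaning)
for group in clusters:
    print(group)
```

With the toy check, the two Paris answers land in one group and the Rome answer in another, mirroring the example in the text.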

The researchers then calculate a number that they call “semantic entropy” – in other words, a measure of how similar or different the meanings of each answer are. If the model’s answers all have different meanings, the semantic entropy score will be high, indicating that the model is confabulating. If the model’s answers all have identical or similar meanings, the semantic entropy score will be low, indicating that the model is giving a consistent answer and is therefore unlikely to be confabulating. (The answer could still be consistently wrong, but this would be a different form of hallucination, for example one caused by problematic training data.)
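To make the score concrete, here is a minimal sketch that treats semantic entropy as the ordinary Shannon entropy of the meaning groups, estimated from how many sampled answers fall into each group. That count-based estimate is an assumption made for illustration; the article does not give the paper's exact formula, and the function name `semantic_entropy` is hypothetical.

```python
import math
from typing import List


def semantic_entropy(cluster_sizes: List[int]) -> float:
    """Shannon entropy (in nats) over meaning clusters.

    cluster_sizes[i] is the number of sampled answers that were
    assigned to meaning group i. Assumption: each group's probability
    is estimated simply as its share of the samples.
    """
    total = sum(cluster_sizes)
    if total == 0:
        raise ValueError("need at least one sampled answer")
    probs = [n / total for n in cluster_sizes if n > 0]
    return sum(p * math.log(1 / p) for p in probs)


# Ten sampled answers in three hypothetical scenarios.
consistent = semantic_entropy([10])      # all ten answers mean the same thing
mixed = semantic_entropy([8, 2])         # mostly one meaning, two dissenters
scattered = semantic_entropy([1] * 10)   # every answer means something different

print(consistent, mixed, scattered)
```

The consistent case scores zero, the scattered case scores the maximum possible for ten samples, and the mixed case falls in between. A deployed system would still need to choose a threshold above which an answer is flagged, which this sketch leaves open.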

The researchers said the method of detecting semantic entropy outperformed several other approaches for detecting AI hallucinations. Those methods included “naive entropy,” which only detects whether the wording of a sentence, rather than its meaning, is different; a method called “P(True)” which asks the model to assess the truthfulness of its own answers; and an approach called “embedding regression,” in which an AI is fine-tuned on correct answers to certain questions. Embedding regression is effective at ensuring AIs accurately answer questions about specific subject matter, but fails when different kinds of questions are asked. One significant difference between the method described in the paper and embedding regression is that the new method doesn’t require sector-specific training data—for example, it doesn’t require training a model to be good at science in order to detect potential hallucinations in answers to science-related questions. This means it works with similar effects across different subject areas, according to the paper.

Farquhar has some ideas for how semantic entropy could begin reducing hallucinations in leading chatbots. He says it could in theory allow OpenAI to add a button to ChatGPT, where a user could click on an answer and get a certainty score that would allow them to feel more confident about whether a result is accurate. He says the method could also be built in under the hood of other tools that use AI in high-stakes settings, where trading off speed and cost for accuracy is more desirable.

While Farquhar is optimistic about the potential of his team’s method to improve the reliability of AI systems, some experts caution against overestimating its immediate impact. Arvind Narayanan, a professor of computer science at Princeton University, acknowledges the value of the research but emphasizes the challenges of integrating it into real-world applications. “I think it’s nice research … [but] it’s important not to get too excited about the potential of research like this,” he says. “The extent to which this can be integrated into a deployed chatbot is very unclear.”

Narayanan notes that with the release of better models, the rates of hallucinations (not just confabulations) have been declining. But he’s skeptical the problem will disappear any time soon. “In the short to medium term, I think it is unlikely that hallucination will be eliminated. It is, I think, to some extent intrinsic to the way that LLMs function,” he says. He points out that, as AI models become more capable, people will try to use them for increasingly difficult tasks where failure might be more likely. “There’s always going to be a boundary between what people want to use them for, and what they can work reliably at,” he says. “That is as much a sociological problem as it is a technical problem. And I don’t think it has a clean technical solution.”
