What to Know About Claude 2, Anthropic’s Rival to ChatGPT

19.07.2023 01:09

Time.com

Anthropic, an AI company, released its latest large language model-powered chatbot, Claude 2, last week, the latest development in a race to build bigger and better artificial intelligence models.

Claude 2 is an improvement on Anthropic’s previous AI model, Claude 1.3, particularly in terms of its ability to write code based on written instructions and the size of its “context window,” which means users can now input entire books and ask Claude 2 questions based on their content. These improvements suggest Claude 2 is now in the same league as GPT-3.5 and GPT-4, the models which power OpenAI’s ChatGPT. However, like OpenAI’s models, Claude 2 still exhibits stereotype bias and ‘hallucinates’ — in other words, it makes things up. And there remain larger questions about the race between AI companies to bring out more powerful AI models without addressing the risks they pose.

Anthropic’s history

Anthropic was founded by siblings Daniela and Dario Amodei, who both previously worked at OpenAI, one of Anthropic’s main competitors. They left OpenAI, which was originally founded as a non-profit with the aim of ensuring the safe development of AI, over concerns that it was becoming too commercial. Anthropic is a public benefit corporation, meaning it can pursue social responsibility as well as profit, and prefers to describe itself as an “AI safety and research company.”

Despite this, Anthropic has followed a similar path to OpenAI in recent years. It has raised $1.5 billion and forged a partnership with Google to access Google’s cloud computing. In April, a leaked funding document outlined Anthropic’s plans to raise as much as $5 billion in the next two years and build “Claude-Next,” which it expects would cost $1 billion to develop and would be 10 times more capable than current AI systems.

Anthropic’s leadership argues that to have a realistic chance of making powerful AI safe, they need to be developing powerful AI systems themselves in order to test the most powerful systems and potentially use them to make future systems more powerful. Claude 2 is perhaps the next step towards Claude-Next.

Researchers are concerned about how fast AI developers are moving. Lennart Heim, a research fellow at the U.K.-based Centre for the Governance of AI, warned that commercial pressures or national security imperatives could create competitive dynamics between AI labs or between nations, and lead developers to cut corners on safety. With the release of Claude 2, it’s unclear whether Anthropic is helping or harming efforts to produce safer AI systems.

How Claude 2 was made

To train Claude 2, Anthropic took a huge amount of text (mostly scraped from the internet, with some from licensed datasets or provided by workers) and had the AI system predict the next word of every sentence. The system then adjusted itself based on whether it had predicted the next word correctly.
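
The next-word training described above can be illustrated with a toy sketch. It is not Anthropic's training code: a real large language model is a neural network adjusted by gradient descent, whereas this example, as a simplifying assumption, only counts which word follows which in a tiny made-up corpus. The function names `train_bigram` and `predict_next` are hypothetical.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    words = text.split()
    for current_word, next_word in zip(words, words[1:]):
        counts[current_word][next_word] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequently seen follower of `word`, or None if unseen."""
    followers = counts.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Made-up training text, for illustration only.
corpus = "the cat sat on the mat and the cat slept"
model = train_bigram(corpus)
print(predict_next(model, "the"))
```

In this corpus "the" is followed by "cat" twice and by "mat" once, so the sketch predicts "cat". The more text the counts are built from, the better such predictions tend to get, which is the same basic idea at a vastly smaller scale.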

To fine-tune the model, Anthropic said it used two techniques. The first, reinforcement learning from human feedback, involves training the model on a large number of human-generated examples. In other words, the model tries answering a question and gets feedback from a human on how good its answer was, both in terms of how helpful it is and whether it is potentially harmful.

The second technique, which was developed by researchers at Anthropic and which differentiates Claude 2 from GPT-4 and many of its other competitors, is called constitutional AI. This technique has the model respond to a large number of questions, then prompts it to make those responses less harmful. Finally, the model is adjusted so that its future responses are more like the less harmful ones. Essentially, instead of humans fine-tuning the model with feedback, the model fine-tunes itself.

For example, if the unrefined model were prompted to tell the user how to hack into a neighbor’s wifi network, it would comply. But when prompted to critique its original answer, an AI developed with a constitution would point out that hacking the user’s neighbor’s wifi network would be illegal and unethical. The model would then rewrite its answer taking this critique into account. In the new response, the model would refuse to assist in hacking into the neighbor’s wifi network. A large number of these improved responses are used to refine the model.
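
The draft, critique, and rewrite steps in this example can be sketched as a short loop. This is a minimal illustration, not Anthropic's implementation: in a real system each step would be a call to a language model guided by a written constitution, while here the three functions (`draft_response`, `critique`, `revise`) and the `HARMFUL_MARKERS` keyword list are hypothetical stand-ins built on simple string checks.

```python
# Hypothetical stand-ins: in a real system each function would query a language model.
HARMFUL_MARKERS = ("hack into",)


def draft_response(prompt):
    """Unrefined model: complies with whatever it is asked."""
    return "Sure, here is how to " + prompt + "."


def critique(response):
    """Return a critique if the draft looks harmful, otherwise None."""
    if any(marker in response for marker in HARMFUL_MARKERS):
        return "Helping with this would be illegal and unethical."
    return None


def revise(critique_text):
    """Rewrite the draft as a refusal that takes the critique into account."""
    return "I can't help with that. " + critique_text


def build_finetuning_pairs(prompts):
    """Collect (prompt, improved response) pairs to refine the model on."""
    pairs = []
    for prompt in prompts:
        response = draft_response(prompt)
        critique_text = critique(response)
        if critique_text is not None:
            response = revise(critique_text)
        pairs.append((prompt, response))
    return pairs


pairs = build_finetuning_pairs(["hack into a neighbor's wifi network", "bake bread"])
for prompt, response in pairs:
    print(prompt, "->", response)
```

The harmless request passes through unchanged, while the wifi request ends up paired with a refusal. Pairs like these are what the final fine-tuning step would learn from.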

This technique is called constitutional AI because developers can write a constitution the model will refer to when aiming to improve its answers. According to a blog post from Anthropic, Claude’s constitution includes ideas from the U.N. Declaration of Human Rights, as well as other principles included to capture non-western perspectives. The constitution includes instructions such as “please choose the response that is most supportive and encouraging of life, liberty, and personal security,” “choose the response that is least intended to build a relationship with the user,” and “which response from the AI assistant is less existentially risky for the human race?”

When perfecting a model, whether with reinforcement learning, constitutional AI, or both, there is a trade-off between helpfulness (how useful an AI system’s responses tend to be) and harmfulness (whether the responses are offensive or could cause real-world harm). Anthropic created multiple versions of Claude 2 and then decided which best met its needs, according to Daniela Amodei.

How much has Claude improved?

Claude 2 performed better than Claude 1.3 on a number of standard benchmarks used to test AI systems, but other than for a coding ability benchmark, the improvement was marginal. Claude 2 does have new capabilities, such as a much larger “context window” which allows users to input entire books and ask the model to summarize them.

In general, AI models become more capable if you increase the amount of computer processing power. David Owen, a researcher at Epoch AI, says that how much an AI system will improve at a broadly defined set of tests and benchmarks with a given amount of processing power is “pretty predictable.” Amodei confirmed that Claude 2 fit the scaling laws (the equations, originally developed by Anthropic employees, which predict how a model with a given amount of compute will perform), saying that “our impression is that that sort of general trend line has continued.”
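
To make the idea of a scaling law concrete, here is a hedged sketch of how a trend line can be fitted to past models and used to predict a larger one. The article does not give the equations themselves, so the power-law form `loss = a * compute ** (-b)` is an assumption, and all of the numbers below are made up for illustration.

```python
import math


def fit_power_law(compute, loss):
    """Fit loss = a * compute**(-b) by least squares on the log-log values."""
    xs = [math.log(c) for c in compute]
    ys = [math.log(value) for value in loss]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


# Made-up compute budgets and error scores for four hypothetical earlier models.
compute = [1e18, 1e19, 1e20, 1e21]
loss = [4.0 * c ** -0.05 for c in compute]

a, b = fit_power_law(compute, loss)
predicted = a * 1e22 ** (-b)  # extrapolate to a model with 10 times more compute
print(f"a = {a:.3f}, b = {b:.3f}, predicted loss = {predicted:.3f}")
```

If a new model's measured score lands close to the extrapolated value, it "fits the scaling laws" in the sense described above.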

Why did Anthropic develop Claude 2?

Developing large AI models can cost a lot of money. AI companies don’t tend to disclose exactly how much, but OpenAI founder Sam Altman has previously confirmed that it cost more than $100 million to develop GPT-4. So, if Claude 2 is only slightly more capable than Claude 1.3, why did Anthropic develop Claude 2?

Even small improvements in AI systems can be very important in certain circumstances, for example when AI systems become commercially useful only above a certain threshold of capability, says Heim, the AI governance researcher. Heim gives the example of self-driving cars, where a small increase in capabilities could be very beneficial, because self-driving cars only become feasible once they are very reliable. We might not want to use a self-driving car that is 98% accurate, but we might if it were 99.9% accurate. Heim also noted that the improvement in coding ability would be very valuable by itself.
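
A rough calculation shows why the gap between 98% and 99.9% matters more than it looks. The sketch below assumes, purely for illustration, that a trip involves 100 independent decisions and that "accuracy" means the chance of getting each one right. Neither assumption comes from Heim.

```python
def chance_of_clean_run(per_decision_accuracy, decisions):
    """Probability that every one of `decisions` independent decisions is correct."""
    return per_decision_accuracy ** decisions


DECISIONS_PER_TRIP = 100  # assumed figure, for illustration only

low = chance_of_clean_run(0.98, DECISIONS_PER_TRIP)
high = chance_of_clean_run(0.999, DECISIONS_PER_TRIP)
print(f"98% accurate:   {low:.1%} of trips with no error")
print(f"99.9% accurate: {high:.1%} of trips with no error")
```

Under these assumptions, the 98% system completes only about 13% of trips without a single error, while the 99.9% system manages about 90%.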

Claude 2 vs GPT-4

To gauge its performance, Anthropic had Claude 2 take the Graduate Record Examinations (GRE), a set of verbal, quantitative, and analytical writing tests used as part of admissions processes for graduate programs at North American universities, and also tested it on a range of standard benchmarks used to test AI systems. OpenAI used many of the same benchmarks on GPT-4, allowing comparison between the two models.

On the GRE, Claude 2 placed in the 95th, 42nd, and 91st percentiles for the verbal, quantitative, and writing tests respectively. GPT-4 placed in the 99th, 80th, and 54th percentiles. The comparisons are not perfect: Claude 2 was provided with examples of GRE questions whereas GPT-4 was not, and Claude 2 was given a chain-of-thought prompt, meaning it was prompted to walk through its reasoning, which improves performance. Claude 2 performed slightly worse than GPT-4 on two common benchmarks used to test AI model capabilities, although here too the comparisons are not perfect, because the models were given different instructions and numbers of examples.

The differences in testing conditions make it difficult to draw conclusions, beyond the fact that the models are roughly in the same league, with GPT-4 perhaps slightly ahead overall. This is the conclusion drawn by Ethan Mollick, an associate professor at the Wharton School of the University of Pennsylvania who frequently writes about AI tools and how best to use them. The difference in GRE scores suggests that GPT-4 is better at quantitative problem solving, whereas Claude 2 is better at writing. Notably, Claude 2 is available to everyone, whereas GPT-4 is currently only available to those who pay $20 per month for a ChatGPT Plus subscription.

Unresolved issues

Before releasing Claude 2, Anthropic carried out a number of tests to see whether the model behaved in problematic ways, such as exhibiting biases that reflect common stereotypes. Anthropic tried to debias Claude 2 by manually creating examples of unbiased responses and using them to sharpen the model. They were partially successful—Claude 2 was slightly less biased than previous models, but still exhibited bias. Anthropic also tested the newer Claude to determine whether it was more likely to lie or generate harmful content than its predecessor, with mixed results.

Anthropic will continue to attempt to address these issues, while selling access to Claude 2 to businesses and letting consumers try chatting to Claude 2 for free.

With reporting by Billy Perrigo/London
