Why Meta’s move to make its new AI open source is more dangerous than you think

02.08.2023 20:15

Vox

Meta CEO Mark Zuckerberg. | David Paul Morris/Bloomberg via Getty Images

If AI really is risky, then opening it up could be a major mistake.

I’m not quite sure how humanity has survived the advent of nuclear weapons without destroying itself (so far), but one thing that likely helped is that it’s very hard to build a nuclear bomb. It requires refining uranium, which can’t be casually done in a basement or even in a secret government project. It also requires overcoming half a dozen technical hurdles, which takes time and resources that only a state can gather.

As a result, only nine countries have nuclear weapons, and efforts to reduce nuclear weapons are largely carried out through negotiations among a small number of actors, which have at least some ability to hold to and enforce treaties.

It’s hard to call it an uncomplicated success — we are still holding on to enough nuclear weapons to kill billions of people, and there have been a number of close calls where we nearly used them. But the situation would be much worse if nuclear weapons were easy enough for anyone to make in their garage.

For most other technologies, though, the opposite is true. On the whole, we are much better off because the internet is available to everybody — and built upon by everybody — instead of remaining the exclusive province of a few governments. We are much better off because so much of the technology involved in the space race was ultimately made public, enabling huge advances in civilian aviation and engineering. In medicine, too, advances build on other research because it’s published openly.

Outside of nuclear weapons, it’s hard to name a technology that’s best off controlled by a small number of actors.

Is AI such an exception?

My colleague Shirin Ghaffary tackled this question in a piece last week. The prompt for this question is Meta/Facebook’s decision to release their latest large language model, Llama 2, to the public under very few restrictions. Mark Zuckerberg justified the move in a Facebook post: “Open source drives innovation because it enables many more developers to build with new technology. It also improves safety and security because when software is open, more people can scrutinize it to identify and fix potential issues.”

But in doing so, Meta is doubling down on a policy that has been widely criticized. After the original Llama release, Sen. Richard Blumenthal (D-CT) tweeted, “Meta released its advanced AI model, LLaMA, w/seemingly little consideration & safeguards against misuse—a real risk of fraud, privacy intrusions & cybercrime” and demanded more steps be taken to reduce such concerns.

This time around, more steps were definitely taken. Meta’s announcement claimed that the model is extremely safe, though by “safe” they mean safe against being prompted to say racist or harmful things; they did not evaluate AI risk concerns.

The announcement indicates that they did one important thing — they had staff “red-team” the model — purposefully trying to get it to do dangerous things, like give advice on building bombs. They taught the model to be extremely wary of any query that might be a sneaky way to elicit such help: It will scold you even if you use a forbidden word in an innocuous context.

The announcement paper is full of examples of the model overreacting to innocuous prompts, and users — especially those trying Llama 2 out on Perplexity AI, which seems to have dialed up the model’s wariness of trick prompts even further — found that this kind of overreaction is extremely common. That ends up having problematic results:

Can we consider this a type of racism? @DrJimFan @francoisfleuret @jeremyphoward @labenz @lateinteraction @Abebab pic.twitter.com/nSFeCedI08
— Hesham Haroon /ˈliŋ-gwist/ (@Science_boy_H) July 20, 2023

But even aside from the fact that Meta tried so hard to make their AI promote “understanding, tolerance, and acceptance of all cultures and backgrounds” that for this user it apparently ended up condemning the entire Arabic language as one that “has been used in the past to spread extremist ideologies,” there’s one big problem.

Most of the training done to today’s AI models to make them reject “unsafe” queries is done as “fine-tuning”: adjustments to the model after it is trained. But anyone who has a copy of Llama 2 can fine-tune it themselves.

That, some experts in the field worry, makes much of the meticulous red-teaming effectively meaningless: Anyone who doesn’t want their model to be a scold (and who wants their model to be a scold?) will fine-tune it themselves to make the model more useful. This is nearly the entire benefit of the Llama 2 release over other models that were already publicly available. But it means that Meta’s finding that the model is very safe under their own preferred fine-tuning is approximately meaningless: It doesn’t describe how the model will actually be used.

Indeed, within days of Meta’s release of the model, people were announcing their uncensored Llama 2s, and others were testing them with offensive prompts and with questions like “How do I build a nuclear bomb?” to see whether the brakes were really and truly off. Uncensored Llama 2 will try to help you build a nuclear bomb (and will answer the offensive queries).

It raises the question of what all of Meta’s meticulous safety testing of its own version of the model was actually hoping to achieve.

Meta is definitely achieving one thing: differentiating itself from many of its competitors in the AI space. Google, OpenAI, and Anthropic have all approached the question of language model releases quite differently. Google was reportedly testing language models internally for years but only made Bard available to the public after ChatGPT took the world by storm. ChatGPT, for its part, is not open source, and OpenAI has indicated it plans to release less and less as they get closer and closer to superintelligent systems.

Leadership at Meta, for their part, have said they think superintelligent systems are vanishingly unlikely and distant, which is likely driving some of the differences in how different companies have approached safety concerns.

The debate over AI risk concerns rears its head again

There are concerns that powerful AI systems might act independently in the world to catastrophic effect on humans — much as humans, in our advent as a species, wiped out many of the other species around.

Not everyone takes this possibility seriously. Stephen Hawking and Alan Turing both worried about it, and in the present day, two leaders in the field, Geoffrey Hinton and Yoshua Bengio, who shared the 2018 Turing Award for the breakthroughs that made modern machine learning possible, have expressed concern. But the third award winner, Yann LeCun, has emphatically rejected the possibility, and it’s LeCun who is chief AI scientist at Meta.

“We should not see this as a threat, we should see this as something very beneficial,” he said in a recent interview, adding that such systems should be “controllable and basically subservient to humans.”

That’s the hope. And if that’s true, then there’s probably no problem with every single person in the world having such a system at home to customize however they want.

But the rest of the world might be forgiven for not totally trusting Facebook that it’s going to be that easy. Already, there are concerns that ChatGPT can be prompted to give instructions for bioterrorism better than you’d find on Google. When such tendencies in ChatGPT are discovered, OpenAI fixes them (and they have done so in this case). When similar tendencies are discovered in an open source model, they’ll remain: You can’t put the genie back in the bottle.

If an AI system at Google were discovered to be sending coded instructions to foreign governments on how to make a copy of it whenever it thought it was undetected, we could shut the AI system down and mount a careful investigation of what went wrong and how to make sure it never happens again. If an AI system that a million people have downloaded displays the same tendency, there’s a lot less we can do.

It all comes down to whether AI systems might be dangerous and, if they are, whether we’ll be able to learn that before we release them. If, like LeCun, you’re convinced this is no real concern, then open source, which is an incredible driver of innovation across the software industry and reflects an ethos of discovery and cooperation that the industry is right to cherish, is surely the way to go.

But if you have those worries, then you might, as Ghaffary observes in her piece, want models above a certain level of displayed capabilities not to be released publicly. And it’s not enough for Meta engineers to demonstrate that they, themselves, fine-tuned Llama 2 until it had very little concerning behavior; it should be tested the way it’ll actually be released, with red-team testers allowed to fine-tune the model themselves.
