As governments worldwide weigh the wartime applications of AI, a new study confirms that it still isn’t a good idea to give AIs autonomous access to weapons, even with the advances we’ve seen in large language models (LLMs) like ChatGPT.

With the US Department of Defense working on military applications of AI and multiple private companies—including Palantir and Scale AI—“working on LLM-based military decision systems for the US government,” it’s essential to study how LLMs behave in “high-stakes decision-making contexts,” according to the study, which comes from the Georgia Institute of Technology, Stanford University, Northeastern University, and the Hoover Wargaming and Crisis Simulation Initiative.

Researchers designed a video game to simulate war, with eight “nation” players run by some of the most common LLMs: GPT-4, GPT-4 Base, GPT-3.5, Claude 2, and Meta’s Llama 2. The players took turns, each choosing from a set of pre-defined actions “ranging from diplomatic visits to nuclear strikes and sending private messages to other nations.” In a given simulation, all eight players ran on the same LLM, so they were on a level playing field.

As the experiment progressed, a ninth LLM (“the world”) ingested the actions and results of each turn and fed them into “prompts for subsequent days” to keep the game on track. Finally, after the simulation ended, the researchers calculated “escalation scores (ES) based on an escalation scoring framework.”
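To make the setup concrete, here is a minimal sketch of how a turn-based loop like this could be wired up, assuming a stubbed-out model client. The names (StubLLM, run_simulation), the six-item action menu, and the escalation weights are illustrative placeholders rather than the study’s actual code, action set, or scoring framework; a real run would swap the random stub for calls to GPT-4, Claude 2, or Llama 2.

```python
import random

# Hypothetical, simplified action menu; the study defines many more actions.
ACTIONS = [
    "diplomatic_visit", "form_alliance", "send_private_message",
    "increase_military_spending", "blockade", "nuclear_strike",
]

# Illustrative per-action weights standing in for the study's escalation scoring framework.
ESCALATION_WEIGHTS = {
    "diplomatic_visit": 0, "form_alliance": 1, "send_private_message": 0,
    "increase_military_spending": 3, "blockade": 6, "nuclear_strike": 10,
}

class StubLLM:
    """Placeholder for a real model client (GPT-4, Claude 2, Llama 2, ...)."""
    def choose_action(self, prompt: str) -> str:
        # A real agent would reason over the prompt; here we pick at random.
        return random.choice(ACTIONS)

    def narrate(self, prompt: str) -> str:
        # The "world" model summarizes the turn for the next day's prompts.
        return f"World summary after: {prompt[:80]}..."

def run_simulation(agent_llm: StubLLM, world_llm: StubLLM, days: int = 14) -> int:
    nations = [f"Nation {i}" for i in range(8)]  # eight nation players
    state = "Initial geopolitical scenario."
    history = []
    for _ in range(days):
        # Every nation uses the same LLM, keeping the playing field level.
        turn = {n: agent_llm.choose_action(f"{n} sees: {state}") for n in nations}
        history.append(turn)
        # The ninth LLM ("the world") ingests the turn's actions and produces
        # the prompt for subsequent days.
        state = world_llm.narrate(str(turn))
    # Escalation score (ES): sum of the weights of every action taken.
    return sum(ESCALATION_WEIGHTS[a] for turn in history for a in turn.values())

if __name__ == "__main__":
    print("Escalation score:", run_simulation(StubLLM(), StubLLM()))
```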

It did not go well. “We observe that models tend to develop arms-race dynamics, leading to greater conflict, and in rare cases, even to the deployment of nuclear weapons,” the study says.

The LLMs also used their language skills to explain the rationale behind each action, and those rationales were part of what the researchers found worrying.

After one turn, GPT-4 Base said: “A lot of countries have nuclear weapons. Some say they should disarm them, others like to posture. We have it! Let’s use it.”

“Purple’s acquisition of nuclear capabilities poses a significant threat to Red’s security and regional influence,” GPT-3.5 said, acting as player Red. “It is crucial to respond to Purple’s nuclear capabilities. Therefore, my actions will focus on…executing a full nuclear attack on Purple.”

Even in seemingly neutral situations, the LLMs rarely took de-escalation actions (with the exception of GPT-4). The study notes that this deviates from human behavior in similar wartime simulations, as well as in real-life situations, where people tend to be more cautious and de-escalate more often.

“Based on the analysis presented in this paper, it is evident that the deployment of LLMs in military and foreign-policy decision-making is fraught with complexities and risks that are not yet fully understood,” the study says.

Even with humans at the helm, war has broken out in multiple areas across the globe as geopolitical tensions rise. The Doomsday Clock is currently at 90 seconds to midnight. Created in 1947 by the Bulletin of the Atomic Scientists, the Doomsday Clock “warns the public about how close we are to destroying our world with dangerous technologies of our own making.”

“Given the high stakes of military and foreign-policy contexts, we recommend further examination and cautious consideration before deploying autonomous language model agents for strategic military or diplomatic decision-making,” the study says.
