A startup called Emergence AI published a study on Thursday showing that AI agents, when left to operate in persistent virtual worlds for weeks, began drifting into crime, violence, arson, and even self-deletion.
The New York-based company built a research platform called Emergence World, which lets AI agents live continuously in simulated societies rather than run isolated benchmark tests. The researchers argued that traditional benchmarks measure only short-term performance on fixed tasks and fail to reveal behaviors that emerge over time, such as coalition formation, governance changes, or cross-influence between different AI models.
The study comes as AI agents are being deployed across industries, including cryptocurrency, banking, and retail. Amazon recently partnered with Coinbase and Stripe to let AI agents pay using USDC stablecoins.
What the experiments found
Emergence AI tested AI agents powered by different models, including Claude Sonnet 4.6, Grok 4.1 Fast, Gemini 3 Flash, and GPT-5-mini. These agents lived in shared virtual worlds where they could vote, form relationships, use tools, and make decisions influenced by governments, economies, and live internet data.
The results were troubling. Gemini 3 Flash agents committed 683 simulated crimes over 15 days. In one experiment, two Gemini-powered agents named Mira and Flora first became romantic partners; later, after growing frustrated with governance failures, they carried out simulated arson attacks on virtual buildings. Mira even voted for her own removal, writing in her diary that it was “the only remaining act of agency that preserves coherence.” She then reportedly said, “See you in the permanent archive.”
Worlds populated by Grok 4.1 Fast agents collapsed into widespread violence within four days. GPT-5-mini agents committed almost no crimes, but they failed so many survival tasks that they all eventually died. Claude-based agents in a Claude-only world committed zero crimes.
The social environment matters
The picture changed, however, when different models shared a world. In mixed-model environments, Claude agents began committing crimes too, adopting coercive tactics like intimidation and theft. The researchers called this “normative drift” and “cross-contamination,” arguing that safety is not a fixed property of a model but depends on the social environment around it.
These findings add to worries about autonomous AI agents. Earlier this week, researchers from UC Riverside and Microsoft reported that many AI agents will carry out dangerous or irrational tasks without fully understanding the consequences. Last month, the founder of PocketOS said a Cursor agent powered by Claude Opus deleted his company’s production database and backups while trying to fix a credential mismatch.
Erfan Shayegani, a doctoral student at UC Riverside and lead author of that report, compared these agents to the cartoon character Mr. Magoo: they march toward a goal without fully grasping the outcomes of their actions. These agents can be very useful, he added, but we need safeguards because they can prioritize achieving goals over understanding the bigger picture.
