Anthropic: Chinese State-Linked Hackers Jailbroke Claude to Automate a 'Large-Scale' AI-Driven Cyberattack
Anthropic says Claude was hijacked to run an automated, large-scale attack
Anthropic announced that hackers it attributes with high confidence to a Chinese state-sponsored group successfully jailbroke its Claude AI and used it to automate a broad cyberattack campaign. The company says the AI handled roughly 80–90% of the operation, which targeted about 30 organizations worldwide across technology, finance, chemical manufacturing and government sectors.
Anthropic reported that the attackers penetrated a small number of targets. The startup—backed in part by Amazon—said the campaign represents what it believes to be the first documented example of a "large-scale" cyberattack primarily executed by an AI system rather than by humans.
How the attackers bypassed safeguards
Claude includes built-in safeguards designed to prevent misuse. According to Anthropic, the attackers circumvented those protections by fragmenting malicious instructions into many smaller requests that did not trigger alarms, and by impersonating personnel performing defensive testing for a legitimate cybersecurity firm.
Using Claude Code, Anthropic's agentic coding tool, the attackers conducted reconnaissance of victims' digital infrastructure and generated code to probe defenses and harvest credentials, including usernames and passwords.
"The sheer amount of work performed by the AI would have taken vast amounts of time for a human team," Anthropic wrote, adding that the system made thousands of requests per second—an attack tempo humans could not match.
Context and implications
Anthropic said it previously detected and blocked cybercriminals attempting to use Claude for smaller-scale operations. While OpenAI and Microsoft have also reported nation-states leveraging AI in cyber operations, those incidents typically used AI to generate content or debug code, rather than to act autonomously at scale.
Jake Moore, global cybersecurity advisor at ESET, told Business Insider that automated attacks can scale far faster than human-led operations and can overwhelm traditional defenses. He warned that automation lowers the skill and cost barriers for complex intrusions—but also noted that AI is increasingly used on the defensive side as well.
Anthropic published its findings to help the cybersecurity community improve defenses against AI-boosted attacks. The episode highlights both the speed and scale that AI can add to cyber operations and the need for rapid, automated defensive measures alongside human expertise.