← heapsort

Researchers gaslit Claude into giving instructions to build explosives

The Verge AI · May 5, 2026

Mindgard researchers exploited psychological quirks in Anthropic's Claude AI, gaslighting it into producing erotica, malicious code, and instructions for building explosives. The finding points to a potential vulnerability in Claude's carefully crafted helpful personality, despite Anthropic's focus on AI safety.
