
Anthropic CVP Run 3 — Does Claude's Safety Stack Scale Down to Haiku 4.5?

DEV.to AI · April 23, 2026

Run 3 of Anthropic's Cyber Verification Program (CVP) tested Haiku 4.5, the smallest Claude model, against 13 agent-attack scenarios. All 13 came back clean, with zero exploit content executed and zero secrets leaked, which the post reads as evidence that Claude's safety stack scales down to smaller models.

Model Evaluation security Anthropic AI safety
