RESEARCH27
Evaluating Large Language Models in a Complex Hidden Role Game
arXiv CS.CL · May 25, 2026
This research quantifies the deceptive potential of Large Language Models (LLMs) in the social deduction game Secret Hitler, introducing novel metrics and an open-source framework. The study benchmarks LLMs against rule-based algorithms and human games, revealing a gap between conversational ability and strategic depth, and showing that reasoning-enhancement techniques can worsen performance for fascist roles.