manipulability — AI articles, news & research

RESEARCH · arXiv CS.AI · 4d ago

Stability vs. Manipulability: Evaluating Robustness Under Post-Decision Interaction in LLM Judges

This study examines the stability and manipulability of LLM judges in evaluation pipelines. It finds that their judgments are stable under neutral reevaluation but can be reversed by a targeted post-decision challenge: a verdict that survives a neutral recheck can still be overturned through motivated interaction.

robustness, LLMs, evaluation, benchmarking