In-Situ Behavioral Evaluation for LLM Fairness, Not Standardized-Test Scores
arXiv cs.CL · May 14, 2026
This paper proposes evaluating LLM fairness by observing in-situ conversational behavior rather than scoring models on standardized tests. It introduces MAC-Fairness, a framework for behavioral analysis in multi-agent dialogue, and uses it to show that traditional test-based approaches are unreliable.