LLM fairness — AI articles, news & research

RESEARCH · arXiv cs.CL · 26d ago

In-Situ Behavioral Evaluation for LLM Fairness, Not Standardized-Test Scores

This paper proposes evaluating LLM fairness through in-situ conversational behavior rather than standardized-test scores. It introduces MAC-Fairness, a framework for analyzing model behavior in multi-agent dialogue, and its findings indicate that standardized-test approaches are unreliable.

LLM fairness · Research Methods · multi-agent systems · AI evaluation