RESEARCH · arXiv CS.AI · 27d ago
DisaBench: A Participatory Evaluation Framework for Disability Harms in Language Models
DisaBench is a participatory evaluation framework for assessing disability-related harms in large language models, an area that general-purpose safety benchmarks cover inadequately. It comprises a co-created taxonomy of twelve harm categories, a methodology that pairs benign and adversarial prompts, and a dataset with human-annotated labels. Together these reveal subtle harms that standard evaluations often miss.