DisaBench: A Participatory Evaluation Framework for Disability Harms in Language Models

arXiv CS.AI·May 14, 2026

DisaBench is a participatory evaluation framework for assessing disability-related harms in large language models, which general-purpose safety benchmarks do not adequately cover. It comprises a co-created taxonomy of twelve harm categories, a methodology that pairs benign and adversarial prompts, and a dataset with human-annotated labels. Together these reveal subtle harms that standard evaluations often miss.

language models · benchmarking · AI ethics · disability harms · safety evaluation
