CLEAR: Revealing How Noise and Ambiguity Degrade Reliability in LLMs for Medicine

arXiv cs.CL · May 5, 2026

The CLEAR framework assesses how ambiguity and uncertainty affect the reliability of medical large language models (LLMs), moving beyond simplified evaluation benchmarks. It systematically perturbs the answer options and their semantic framing. The results show that LLM performance degrades as the number of plausible answers increases, and that models become less cautious when the abstention option is phrased in uncertain terms.

Ambiguity LLMs evaluation Reliability medical AI
