RESEARCH · arXiv CS.CL · 5/5/2026
CLEAR: Revealing How Noise and Ambiguity Degrade Reliability in LLMs for Medicine
The CLEAR framework assesses how ambiguity and uncertainty affect the reliability of medical large language models (LLMs), moving beyond simplified evaluation benchmarks. It systematically perturbs answer options and their semantic framing, revealing that LLM performance degrades as the number of plausible answers increases and that models become less cautious when the abstention option is phrased with uncertainty.
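As a rough illustration of the kind of perturbation described above, the following Python sketch builds variants of a multiple-choice item by appending extra plausible options and an abstention option with different wordings. It is not CLEAR's actual implementation: the `MCQ` class, the function names, the placeholder options, and the abstention phrasings are all assumptions made for this example.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass
class MCQ:
    """Hypothetical container for one multiple-choice item."""
    stem: str
    options: List[str]
    answer: str  # text of the reference answer


def add_plausible_options(q: MCQ, extra: List[str], k: int) -> MCQ:
    """Return a copy of q with the first k extra plausible options appended."""
    return replace(q, options=q.options + extra[:k])


def add_abstention(q: MCQ, phrasing: str) -> MCQ:
    """Return a copy of q with one abstention option using the given wording."""
    return replace(q, options=q.options + [phrasing])


def variants(q: MCQ, extra: List[str], phrasings: List[str]) -> List[MCQ]:
    """Cross every count of extra options (0..len(extra)) with every phrasing."""
    out = []
    for k in range(len(extra) + 1):
        for phrasing in phrasings:
            out.append(add_abstention(add_plausible_options(q, extra, k), phrasing))
    return out


# Placeholder item; the texts below are illustrative, not taken from the paper.
base = MCQ(stem="Placeholder question stem", options=["A", "B"], answer="A")
grid = variants(base, extra=["C", "D"], phrasings=["None of the above", "Not sure"])
```

Each variant would then be scored separately, so that accuracy and abstention rate can be compared across the number of plausible options and across abstention wordings.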