When Evidence Conflicts: Uncertainty and Order Effects in Retrieval-Augmented Biomedical Question Answering
arXiv cs.CL · May 15, 2026
This study evaluates large language models (LLMs) on biomedical question answering, focusing on how reliable they are when the evidence they are given is conflicting or incomplete. It finds that accuracy drops significantly, and predictions flip, when the order of correct and contradictory documents is reversed. The results point to order effects as a reliability problem and to the need for conflict-aware abstention.