EHRBench: An Automated and Reliable EHR-based Benchmark for Clinical Decision Making with LLMs

arXiv cs.AI · June 1, 2026

The paper introduces EHRBench, an automated and reliable benchmark, grounded in electronic health records (EHRs), for evaluating clinical decision making (CDM) with LLMs. It addresses a gap: how reliably LLMs perform on real-world clinical tasks is still poorly understood. The benchmark aims to deliver both scale and quality in CDM evaluation.

LLMs · clinical decision support · Benchmarking · healthcare AI · EHR
