
clinical decision support

3 items

RESEARCH · arXiv CS.AI · 8d ago

EHRBench: An Automated and Reliable EHR-based Benchmark for Clinical Decision Making with LLMs

The paper introduces EHRBench, an automated and reliable benchmark grounded in electronic health records (EHRs) for evaluating clinical decision making (CDM) with LLMs. It targets a gap in current understanding: how reliable LLMs are on real-world clinical tasks. The benchmark aims to deliver both scale and quality in CDM evaluation.
