RESEARCH · arXiv CS.CL · 20d ago
MedicalBench: Evaluating Large Language Models Toward Improved Medical Concept Extraction
This paper introduces MedicalBench, a benchmark for evaluating large language models on medical concept extraction from electronic health records. It focuses on implicit medical reasoning and evidence grounding, addressing the challenge of identifying concepts that a record implies but does not explicitly state.