
Protocol for evaluating ChatGPT in biomedical association generation and verification using a RAG-enabled, cross-model majority voting workflow

arXiv cs.CL · June 1, 2026

This protocol evaluates ChatGPT's ability to generate and verify disease-centric biomedical associations, using biomedical ontologies and literature as references. It combines a self-consistency strategy with a RAG-enabled workflow built on open-source LLMs to address the limitations of exact-match evaluation and to detect hallucinations.
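The cross-model majority voting named in the title can be illustrated with a minimal sketch. It assumes each verifier is a callable standing in for one LLM that returns a verdict label for a candidate association. The function names, the verdict labels, and the tie handling are hypothetical and are not taken from the protocol itself, whose prompts, models, and retrieval steps are not described in this summary.

```python
from collections import Counter


def majority_vote(verdicts):
    """Return the most common verdict, or None for empty input or a tie."""
    counts = Counter(verdicts).most_common()
    if not counts:
        return None
    # A tie between the two leading verdicts means there is no majority.
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]


def verify_association(association, verifiers):
    """Collect one verdict per verifier and return (winner, all_verdicts).

    `verifiers` is a list of callables; each is a hypothetical stand-in
    for one LLM asked whether the association is supported.
    """
    verdicts = [verify(association) for verify in verifiers]
    return majority_vote(verdicts), verdicts


if __name__ == "__main__":
    # Stub verifiers in place of real model calls (assumption for the demo).
    stubs = [
        lambda a: "supported",
        lambda a: "supported",
        lambda a: "unsupported",
    ]
    winner, verdicts = verify_association(("disease X", "gene Y"), stubs)
    print(winner, verdicts)
```

Returning `None` on a tie, rather than picking a verdict arbitrarily, is one possible design choice: it lets undecided associations be flagged for manual review.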

LLMs evaluation ChatGPT RAG Biomedical AI
