Generating High Quality Synthetic Data for Dutch Medical Conversations
arXiv CS.CL · April 14, 2026
This paper presents a pipeline that uses a fine-tuned large language model to generate synthetic Dutch medical dialogues, addressing the scarcity of clinical data caused by privacy constraints. Evaluations showed strong lexical variety, but qualitative review revealed a scripted conversation flow and issues with domain specificity.