← heapsort

Fidelity, Diversity, and Privacy: A Multi-Dimensional LLM Evaluation for Clinical Data Augmentation

arXiv CS.LG · May 1, 2026

This research proposes using LLMs (DeepSeek-R1, OpenBioLLM-Llama3, Qwen 3.5) to generate synthetic mental health data for augmentation, addressing data scarcity and the constraints of privacy regulations. It introduces an evaluation framework that assesses semantic fidelity, lexical diversity, and privacy/plagiarism in order to mitigate risks such as mode collapse and memorization.
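
To make two of these dimensions concrete, here is a minimal sketch in Python. It is not the paper's implementation: the summary does not name the specific metrics used, so distinct-n (as a lexical diversity proxy) and verbatim n-gram overlap (as a memorization/plagiarism proxy) are assumptions chosen for illustration, and the function names `ngrams`, `distinct_n`, and `max_overlap` are hypothetical. Semantic fidelity is left out because it would typically need an embedding model.

```python
def ngrams(text, n):
    """Lowercase, split on whitespace, and return the list of n-gram tuples."""
    tokens = text.lower().split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def distinct_n(samples, n=2):
    """Unique n-grams divided by total n-grams across all samples.

    Values near 0 suggest repetitive output (one symptom of mode collapse);
    values near 1 suggest varied wording.
    """
    grams = [g for s in samples for g in ngrams(s, n)]
    return len(set(grams)) / len(grams) if grams else 0.0


def max_overlap(synthetic, sources, n=3):
    """Largest fraction of the synthetic text's n-grams that appear verbatim
    in any single source text. High values hint at memorized or copied text."""
    syn = set(ngrams(synthetic, n))
    if not syn:
        return 0.0
    return max(
        (len(syn & set(ngrams(src, n))) / len(syn) for src in sources),
        default=0.0,
    )


# Toy strings invented for illustration only, not data from the paper.
synthetic_samples = ["i feel tired most days", "i feel tired most nights"]
source_texts = ["i feel tired most days and cannot sleep"]

diversity = distinct_n(synthetic_samples, n=2)
leakage = [max_overlap(s, source_texts, n=3) for s in synthetic_samples]
```

Whitespace tokenization and exact matching keep the sketch dependency-free; a real evaluation would likely use a proper tokenizer and also check for near-duplicates, which exact n-gram matching misses.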
