
spoken language models

1 item

RESEARCH · arXiv cs.CL · 12d ago

Bridging the Stability-Expressivity Gap: Synthetic Data Scaling and Preference Alignment for Low-Resource Spoken Language Models

This research addresses the Stability-Expressivity Gap in Spoken Language Models (SLMs) for low-resource languages, a gap that arises from extensive use of synthetic data. Synthetic data improves phonetic accuracy but degrades prosodic expressivity, a phenomenon the paper terms Synthetic Erosion. To recover expressivity, the paper introduces self-alignment frameworks.
