
Direct Preference Optimization

2 items

RESEARCH · arXiv CS.CL · 14d ago

Direct Preference Optimization for English-Mandarin Code-Switching Speech Recognition in Audio LLMs

This paper investigates how Audio LLMs fail when transcribing English-Mandarin code-switching speech, identifying failure modes such as omitting one language and translating speech instead of transcribing it. Applying Direct Preference Optimization (DPO) aligns the models to preserve mixed-language content, which significantly reduces Mixed Error Rate (MER).
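For readers unfamiliar with DPO, the following is a minimal sketch of the standard DPO loss for a single preference pair. It is not taken from the paper: the function name `dpo_loss`, the value of `beta`, and the pairing (a "chosen" transcript that keeps both languages versus a "rejected" one that drops or translates a language) are assumptions made for illustration.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one preference pair.

    Each argument is the summed log-probability of a full transcript
    under either the trainable policy or the frozen reference model.
    beta=0.1 is an illustrative default, not a value from the paper.
    """
    # Implicit rewards: how much more the policy favors each transcript
    # than the reference model does, scaled by beta.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward

    # loss = -log(sigmoid(margin)), written in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

The loss falls as the policy assigns relatively more probability to the chosen transcript than the reference model does, which is the mechanism that would push a model toward preserving mixed-language output under the assumed pairing.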
