RESEARCH · arXiv CS.CL · 15d ago
EchoDistill: Alignment Noisy-to-Clean Self-Distillation for Robust Audio LLMs
EchoDistill is an alignment-based self-distillation framework designed to make Audio Large Language Models (ALLMs) robust to real-world noise. It uses a frozen clean-audio teacher to guide an inference-time noisy-audio student, optimizing the student's responses via group relative policy optimization (GRPO) and a token-level consistency objective.
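The two training signals named above can be illustrated with a minimal sketch. The summary gives no formulas, so everything here is an assumption: the advantage is the standard group-normalized reward used in GRPO-style methods, and the token-level consistency term is assumed to be a per-token KL divergence from the teacher's distribution to the student's. The function names (`group_relative_advantages`, `token_kl`, `consistency_loss`) are hypothetical and not taken from the paper.

```python
import math


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the mean and std of its own sampling group.

    Assumption: this is the usual GRPO-style advantage, not necessarily
    the exact form used by EchoDistill.
    """
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = math.sqrt(var)
    return [(r - mean) / (std + eps) for r in rewards]


def token_kl(teacher_probs, student_probs, eps=1e-12):
    """KL(teacher || student) over the vocabulary at one token position."""
    return sum(
        p * math.log((p + eps) / (q + eps))
        for p, q in zip(teacher_probs, student_probs)
        if p > 0
    )


def consistency_loss(teacher_seq, student_seq):
    """Mean per-token KL between clean-audio teacher and noisy-audio student.

    Assumption: 'token-level consistency' is modeled here as a KL term;
    the paper may define it differently.
    """
    kls = [token_kl(t, s) for t, s in zip(teacher_seq, student_seq)]
    return sum(kls) / len(kls)


if __name__ == "__main__":
    # Four sampled student responses to one noisy input, with toy rewards.
    print(group_relative_advantages([1.0, 0.0, 1.0, 0.0]))
    # Two token positions over a toy three-word vocabulary.
    teacher = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
    student = [[0.5, 0.3, 0.2], [0.2, 0.6, 0.2]]
    print(consistency_loss(teacher, student))
```

In this sketch, responses that score above their group's average receive a positive advantage, and the consistency term is zero only when the student's token distributions match the teacher's.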