RESEARCH

EchoDistill: Alignment Noisy-to-Clean Self-Distillation for Robust Audio LLMs

arXiv cs.CL · May 26, 2026

EchoDistill is an alignment-based self-distillation framework designed to make Audio Large Language Models (ALLMs) robust to real-world noise. It uses a frozen clean-audio teacher to guide an inference-time noisy-audio student, whose responses are optimized via group-relative policy optimization and token-level consistency.
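
As a rough illustration of the two ingredients named in the summary, the sketch below shows the standard group-relative advantage used in group-relative policy optimization (each sampled response's reward normalized against its own group) and one plausible form of a token-level consistency term (mean per-token KL divergence from teacher to student). The function names, the choice of KL divergence, and the toy numbers are assumptions made for illustration; the summary does not specify the paper's actual rewards or loss.

```python
import math
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each response's reward against the mean and std of its group.

    `rewards` holds one scalar reward per response sampled for the same prompt.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std over the group
    return [(r - mu) / (sigma + eps) for r in rewards]


def token_kl(teacher_probs, student_probs, eps=1e-12):
    """KL(teacher || student) over the vocabulary at one token position."""
    return sum(
        p * math.log((p + eps) / (q + eps))
        for p, q in zip(teacher_probs, student_probs)
        if p > 0
    )


def token_consistency_loss(teacher_seq, student_seq):
    """Mean per-token KL over aligned positions (assumed form of the consistency term)."""
    return mean([token_kl(t, s) for t, s in zip(teacher_seq, student_seq)])


if __name__ == "__main__":
    # Hypothetical rewards for four responses to one noisy-audio prompt.
    print(group_relative_advantages([1.0, 0.0, 0.5, 0.5]))

    # Toy 3-word vocabulary, two token positions:
    # clean-audio teacher vs. noisy-audio student distributions.
    teacher = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
    student = [[0.5, 0.3, 0.2], [0.2, 0.6, 0.2]]
    print(token_consistency_loss(teacher, student))
```

In this sketch the advantages sum to roughly zero within a group, so responses are only rewarded relative to their siblings, and the consistency term is zero exactly when the student's per-token distributions match the teacher's.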

robustness · Audio LLMs · machine learning · Self-Distillation · AI Research
