RESEARCH27

EchoDistill: Alignment Noisy-to-Clean Self-Distillation for Robust Audio LLMs

arXiv CS.CL · May 26, 2026

EchoDistill is an alignment-based self-distillation framework designed to make Audio Large Language Models (ALLMs) robust to real-world noise. It uses a frozen clean-audio teacher to guide an inference-time noisy-audio student, optimizing responses via Group Relative Policy Optimization (GRPO) and token-level consistency.
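
The summary names two ingredients without giving formulas, so the following is only a rough sketch of the generic ideas, not the paper's actual objective. It assumes that "group-relative" means normalizing each sampled response's reward against the other responses in its group (as in standard GRPO), and that "token-level consistency" can be illustrated as a mean per-token KL divergence from the teacher's distribution to the student's. The function names and the toy numbers are hypothetical.

```python
import math


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one sampling group to zero mean and unit std."""
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = math.sqrt(var)
    return [(r - mean) / (std + eps) for r in rewards]


def token_kl(teacher_probs, student_probs, eps=1e-12):
    """KL(teacher || student) at one token position over a shared vocabulary."""
    return sum(
        p * math.log((p + eps) / (q + eps))
        for p, q in zip(teacher_probs, student_probs)
        if p > 0
    )


def token_consistency(teacher_seq, student_seq):
    """Mean per-token KL between teacher and student next-token distributions."""
    kls = [token_kl(t, s) for t, s in zip(teacher_seq, student_seq)]
    return sum(kls) / len(kls)


# Toy data (assumed): four sampled responses scored by some reward function.
rewards = [1.0, 0.0, 0.5, 0.5]
advantages = group_relative_advantages(rewards)

# Toy next-token distributions over a 3-word vocabulary at two positions.
teacher_seq = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]  # clean-audio teacher
student_seq = [[0.5, 0.3, 0.2], [0.2, 0.6, 0.2]]  # noisy-audio student
consistency = token_consistency(teacher_seq, student_seq)
```

In this sketch, responses that score above their group's mean get a positive advantage and those below get a negative one, while the consistency term is zero only when the student's token distributions match the teacher's.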
