RESEARCH · arXiv cs.CL · 1d ago

Data-Efficient Autoregressive-to-Diffusion Language Models via On-Policy Distillation

This paper introduces the On-Policy Diffusion Language Model (OPDLM), an approach for converting autoregressive language models (ARLMs) into diffusion language models (DLMs). It uses on-policy distillation (OPD) to address problems such as knowledge loss and train-inference mismatch.

diffusion models · language models · AI models · machine learning