
LEAP: Unlocking dLLM Parallelism via Lookahead Early-Convergence Token Detection

arXiv cs.LG · May 13, 2026

Diffusion Language Models (dLLMs) fall short of their potential for highly parallel decoding because their confidence thresholds are overly conservative. This paper introduces LEAP, a training-free, plug-and-play method that detects early-converging tokens to increase dLLM parallelism and accelerate decoding.
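To make the threshold bottleneck concrete, here is a minimal Python sketch. It is not the paper's algorithm: the summary above does not describe how LEAP detects early convergence, so the stability rule below (a position's top-1 token unchanged over the last few steps) is an assumed stand-in, and the names `threshold_commit`, `stable_commit`, and `stable_steps` are hypothetical.

```python
from typing import List, Tuple

# (top-1 token id, confidence) predicted for one masked position at one step
Prediction = Tuple[int, float]


def threshold_commit(step: List[Prediction], threshold: float) -> List[int]:
    """Baseline: commit every position whose top-1 confidence meets the threshold."""
    return [i for i, (_, conf) in enumerate(step) if conf >= threshold]


def stable_commit(history: List[List[Prediction]], threshold: float,
                  stable_steps: int) -> List[int]:
    """Assumed heuristic (not from the paper): also commit positions whose
    top-1 token stayed the same over the last `stable_steps` denoising steps,
    even when their confidence is still below the threshold."""
    latest = history[-1]
    committed = set(threshold_commit(latest, threshold))
    if stable_steps >= 1 and len(history) >= stable_steps:
        recent = history[-stable_steps:]
        for i in range(len(latest)):
            if len({step[i][0] for step in recent}) == 1:
                committed.add(i)
    return sorted(committed)


# Toy data: three denoising steps over three masked positions.
history = [
    [(5, 0.40), (9, 0.95), (2, 0.30)],
    [(5, 0.55), (9, 0.97), (7, 0.35)],
    [(5, 0.60), (9, 0.98), (3, 0.50)],
]
baseline = threshold_commit(history[-1], threshold=0.9)
extended = stable_commit(history, threshold=0.9, stable_steps=3)
print(baseline, extended)  # [1] [0, 1]
```

In this toy run, a 0.9 threshold commits only position 1, while the stability rule also commits position 0, whose prediction never changed. That is the kind of extra per-step parallelism a conservative threshold leaves unused.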

Diffusion Models · Parallel Computing · AI · large language models · model optimization
