
SpecTr-GBV: Multi-Draft Block Verification Accelerating Speculative Decoding

arXiv CS.CL · April 30, 2026

SpecTr-GBV is a speculative decoding method that combines multi-draft verification with greedy block verification to accelerate language model inference. It formulates the verification step as an optimal transport problem and achieves the optimal expected acceptance length, which improves both theoretical efficiency and empirical performance.
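
For context, the verification step that such methods build on can be sketched as follows. This is a minimal illustration of the standard single-draft, token-by-token acceptance rule used in speculative sampling, not SpecTr-GBV's multi-draft block verification, whose details are not given in this summary. The function name `verify_draft` and the dict-based distributions are assumptions made for the example.

```python
import random


def verify_draft(draft_tokens, q_probs, p_probs, rng):
    """Baseline token-level speculative verification (NOT SpecTr-GBV).

    draft_tokens: tokens proposed by the draft model.
    q_probs[i], p_probs[i]: draft and target distributions at position i,
    each a dict mapping token -> probability (hypothetical representation).
    Returns the accepted prefix, plus one corrected token if a draft
    token is rejected.
    """
    out = []
    for i, tok in enumerate(draft_tokens):
        p, q = p_probs[i], q_probs[i]
        # Accept the drafted token with probability min(1, p(tok) / q(tok)).
        if rng.random() < min(1.0, p.get(tok, 0.0) / q[tok]):
            out.append(tok)
            continue
        # On rejection, resample from the residual max(0, p - q), renormalized.
        residual = {t: max(0.0, p[t] - q.get(t, 0.0)) for t in p}
        total = sum(residual.values())
        tokens = list(residual)
        weights = [residual[t] / total for t in tokens]
        out.append(rng.choices(tokens, weights=weights, k=1)[0])
        break  # everything after the first rejection is discarded
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    same = {"a": 0.5, "b": 0.5}
    # Identical draft and target distributions: every draft token is accepted.
    print(verify_draft(["a", "b"], [same, same], [same, same], rng))
```

The number of tokens returned per call is the acceptance length. The sketch omits the extra token that is usually sampled from the target model when every draft token is accepted. Multi-draft and block-level schemes aim to increase the expected acceptance length while keeping the output distribution equal to the target model's.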
