RESEARCH
Super Apriel: One Checkpoint, Many Speeds
arXiv cs.LG · April 23, 2026
Super Apriel is a newly released 15B-parameter supernet with four trained mixer choices per decoder layer, so a single checkpoint can serve multiple speed/quality presets. The presets range from 2.9x decode throughput at 96% quality retention to 10.7x at 77%, and the same checkpoint supports speculative decoding without a separate draft model.
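To make the "one checkpoint, many presets" idea concrete, here is a minimal toy sketch in plain Python. It is not the released implementation: the mixer names (`mixer_0` to `mixer_3`), the `SupernetLayer` and `Supernet` classes, and the scaling "mixers" are all hypothetical stand-ins, since the summary only states that each decoder layer has four trained choices. A preset is assumed here to be simply one mixer name per layer.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical names: the summary says only that there are four choices per layer.
MIXER_NAMES = ["mixer_0", "mixer_1", "mixer_2", "mixer_3"]

Mixer = Callable[[List[float]], List[float]]


def make_mixer(scale: float) -> Mixer:
    """Toy stand-in for a trained sequence mixer: scales every activation."""
    return lambda xs: [scale * x for x in xs]


@dataclass
class SupernetLayer:
    # All four mixer choices are stored together, as in a single checkpoint.
    mixers: Dict[str, Mixer]

    def forward(self, xs: List[float], choice: str) -> List[float]:
        if choice not in self.mixers:
            raise KeyError(f"unknown mixer choice: {choice}")
        return self.mixers[choice](xs)


@dataclass
class Supernet:
    layers: List[SupernetLayer]

    def forward(self, xs: List[float], preset: List[str]) -> List[float]:
        # A preset picks exactly one mixer per decoder layer.
        if len(preset) != len(self.layers):
            raise ValueError("preset must name one mixer per layer")
        for layer, choice in zip(self.layers, preset):
            xs = layer.forward(xs, choice)
        return xs


def build_toy_supernet(num_layers: int) -> Supernet:
    """Build layers whose four mixers scale by 1.0, 2.0, 3.0 and 4.0."""
    return Supernet([
        SupernetLayer({name: make_mixer(1.0 + i) for i, name in enumerate(MIXER_NAMES)})
        for _ in range(num_layers)
    ])


model = build_toy_supernet(3)
# Two illustrative presets drawn from the same stored weights.
quality_preset = ["mixer_0", "mixer_0", "mixer_0"]
fast_preset = ["mixer_0", "mixer_3", "mixer_3"]
```

Calling `model.forward(x, quality_preset)` and `model.forward(x, fast_preset)` runs two different paths through the same stored layers. Under the same assumption, a speculative-decoding setup would use a fast preset to draft tokens and a higher-quality preset to verify them, with both read from the one checkpoint.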
neural network architecture, performance optimization, attention mechanisms, large language models, speculative decoding