Llama.cpp MTP support now in beta!
Reddit r/LocalLLaMA · May 4, 2026

Multi-token prediction (MTP) support in llama.cpp is now in beta, initially covering Qwen3.5 MTP, and may be merged soon. Together with maturing tensor-parallel support, it is expected to close performance gaps with vLLM, particularly in token generation speed.