Llama.cpp MTP support now in beta!

Reddit r/LocalLLaMA · May 4, 2026

Multi-token prediction (MTP) support in llama.cpp is now in beta. It initially covers Qwen3.5 MTP and may be merged soon. Together with maturing tensor-parallel support, it is expected to close performance gaps with vLLM, particularly in token generation speed.
