Moonshot open-sourced FlashKDA, CUTLASS kernels for Kimi Delta Attention, up to 2.22x over the Triton baseline on H20
Reddit r/LocalLLaMA · April 22, 2026

Moonshot AI has open-sourced FlashKDA, a CUTLASS C++ kernel implementation of Kimi Delta Attention (KDA) that reports up to a 2.22x speedup over the Triton baseline in H20 benchmarks. The implementation integrates with flash-linear-attention and speeds up linear attention architectures such as KDA.