
Statistical Inference and Quality Measures of KV Cache Quantisations Inspired by TurboQuant

arXiv CS.LG · May 12, 2026

This research analyzes three KV cache quantization schemes (KV, KQV, QKQV) and their effect on inner-product variance, in particular how applying QJL to K inflates that variance, an effect the softmax then amplifies. The empirical findings are threefold: KQV performs best at a budget of n=4; there is an unconditional K-V asymmetry, with QKQV consistently worse than KQV in KL divergence; and geometric K reconstruction shows budget-dependent crossovers.
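
To make the variance point concrete, here is a minimal sketch of a sign-based (1-bit) Johnson-Lindenstrauss inner-product estimator, the general kind of estimator that QJL (quantized Johnson-Lindenstrauss) applies to keys. It is an illustration under stated assumptions, not the paper's code: the function names, the dimension d=64, and the sketch sizes are hypothetical choices, and the KV, KQV, and QKQV schemes themselves are not reproduced here.

```python
import numpy as np

def sign_jl_inner_product(q, k, m, rng):
    """Estimate <q, k> from a 1-bit sketch of k (hypothetical helper).

    Only sign(S @ k) and the norm of k are kept for the key; the query is
    projected with the same Gaussian matrix S but is not quantized.
    """
    d = k.shape[0]
    S = rng.standard_normal((m, d))   # random Gaussian projection
    key_bits = np.sign(S @ k)         # 1-bit quantized key sketch
    scale = np.sqrt(np.pi / 2) * np.linalg.norm(k) / m
    return scale * (key_bits @ (S @ q))

def estimate_stats(q, k, m, trials, rng):
    """Return (mean, variance) of the estimator over independent sketches."""
    samples = np.array(
        [sign_jl_inner_product(q, k, m, rng) for _ in range(trials)]
    )
    return samples.mean(), samples.var()

def demo(seed=0, d=64, trials=500):
    rng = np.random.default_rng(seed)
    q = rng.standard_normal(d)
    k = rng.standard_normal(d)
    q /= np.linalg.norm(q)            # unit vectors keep the scale readable
    k /= np.linalg.norm(k)
    print("exact inner product:", float(q @ k))
    for m in (16, 256):
        mean, var = estimate_stats(q, k, m, trials, rng)
        print(f"m={m}: mean={mean:.4f}, variance={var:.5f}")

if __name__ == "__main__":
    demo()
```

The estimate is unbiased, but its variance is nonzero and shrinks only as the sketch size m grows. Because attention logits pass through an exponential in the softmax, noise of this kind in the query-key inner product is not averaged away linearly, which is the amplification the summary refers to.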
