LKV: End-to-End Learning of Head-wise Budgets and Token Selection for LLM KV Cache Eviction
arXiv cs.LG · May 11, 2026
This paper introduces LKV (Learned KV Eviction), a method for reducing the memory footprint of the key-value (KV) cache in large language models (LLMs). LKV formulates KV cache compression as an end-to-end differentiable optimization problem, learning both the per-head cache budgets and which tokens to retain, rather than relying on hand-designed heuristics.
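To make "head-wise budgets and token selection" concrete, here is a minimal sketch of the eviction step alone. It is not the paper's method: the importance scores and budgets below are hand-set, hypothetical values, whereas LKV learns both end to end. The function name `evict_kv` and its inputs are assumptions made for illustration.

```python
from typing import List, Sequence


def evict_kv(scores: Sequence[Sequence[float]], budgets: Sequence[int]) -> List[List[int]]:
    """Return, for each attention head, the token positions kept in the KV cache.

    scores[h][t] is a hypothetical importance score for token t in head h;
    budgets[h] is the number of tokens head h is allowed to keep.
    """
    if len(scores) != len(budgets):
        raise ValueError("need exactly one budget per head")
    kept = []
    for head_scores, budget in zip(scores, budgets):
        # Clamp the budget to the valid range for this head.
        budget = max(0, min(budget, len(head_scores)))
        # Rank positions by score, highest first; ties go to the earlier position.
        ranked = sorted(range(len(head_scores)), key=lambda i: (-head_scores[i], i))
        # Keep the top `budget` positions, restored to original token order.
        kept.append(sorted(ranked[:budget]))
    return kept


# Illustrative values only (assumed, not taken from the paper).
scores = [
    [0.9, 0.1, 0.4, 0.8],  # head 0
    [0.2, 0.7, 0.3, 0.6],  # head 1
]
budgets = [3, 1]  # head 0 keeps three tokens, head 1 keeps one
kept = evict_kv(scores, budgets)
print(kept)
```

Hard top-k selection like this is not differentiable, which is presumably why an end-to-end formulation is needed to train budgets and selection jointly; how LKV does that is described in the paper itself, not here.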