RESEARCH · 27
Knowledge Packs: Zero-Token Knowledge Delivery via KV Cache Injection
arXiv cs.CL · April 7, 2026
"Knowledge Packs" introduces a zero-token method for delivering knowledge to large language models (LLMs): information is injected directly into the KV cache instead of being supplied as context tokens. The technique aims to improve LLM performance and reduce inference cost by integrating external knowledge without consuming any of the context budget.
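To make the idea of KV cache injection concrete, here is a minimal toy sketch in NumPy. It is not the paper's implementation: the single attention head, the random projection matrices `W_k` and `W_v`, and the names `pack_K` and `pack_V` are all assumptions made for illustration. It only shows the general principle that keys and values computed ahead of time can be prepended to the cache, so the prompt itself carries no knowledge tokens.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def attend(q, K, V):
    # Single-head scaled dot-product attention for one query vector.
    scores = q @ K.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ V

rng = np.random.default_rng(0)
d = 8  # toy hidden size (assumption)

# Hypothetical key/value projections of one attention layer.
W_k = rng.normal(size=(d, d))
W_v = rng.normal(size=(d, d))

# Offline step: encode the knowledge once and store its keys and values.
# `knowledge_hidden` stands in for the hidden states of 5 knowledge tokens.
knowledge_hidden = rng.normal(size=(5, d))
pack_K = knowledge_hidden @ W_k
pack_V = knowledge_hidden @ W_v

# Inference step: the prompt contains only 3 tokens of its own.
prompt_hidden = rng.normal(size=(3, d))
prompt_K = prompt_hidden @ W_k
prompt_V = prompt_hidden @ W_v

# Injection: prepend the precomputed keys/values to the prompt's cache.
cache_K = np.concatenate([pack_K, prompt_K])
cache_V = np.concatenate([pack_V, prompt_V])

q = rng.normal(size=(d,))
out_injected = attend(q, cache_K, cache_V)

# Baseline: the same knowledge placed in the prompt as ordinary tokens.
full_hidden = np.concatenate([knowledge_hidden, prompt_hidden])
out_in_prompt = attend(q, full_hidden @ W_k, full_hidden @ W_v)

print(np.allclose(out_injected, out_in_prompt))
```

In this single-layer toy without positional encodings, the injected cache reproduces the output of placing the knowledge in the prompt. In a real multi-layer transformer, keys and values depend on position and on earlier layers, so an actual system has to handle those details; the sketch deliberately ignores them.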