
Knowledge Packs: Zero-Token Knowledge Delivery via KV Cache Injection

arXiv cs.CL · April 7, 2026

"Knowledge Packs" introduces a zero-token knowledge delivery method for large language models (LLMs) by directly injecting information into the KV cache. This technique aims to enhance LLM performance and reduce inference costs by efficiently integrating external knowledge without consuming context tokens.

Knowledge Injection, Machine Learning, AI, Large Language Models, KV cache
