RESEARCH
OmniMem: Perturbation-aware Memory Compression for Streaming Audio-Visual LLMs
arXiv CS.AI · June 9, 2026
OmniMem is a memory-efficient streaming framework for audio-visual LLMs. It targets a core limit of long-video inference: the number of video tokens and the size of the KV cache keep growing as the stream gets longer. It combines modality-aware memory allocation with perturbation-aware memory selection to preserve informative KV states, improving compression and long-range understanding.
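The two ideas named in the summary, a per-modality memory budget and score-based selection of which KV states to keep, can be illustrated with a generic sketch. This is not OmniMem's algorithm: the summary does not say how budgets are set or how perturbation scores are computed, so the function names, the proportional weights, and the precomputed `scores` below are all assumptions made for illustration.

```python
from typing import Dict, List


def allocate_budget(total: int, weights: Dict[str, float]) -> Dict[str, int]:
    """Split a total KV-entry budget across modalities in proportion to weights.

    Proportional weighting is an assumed policy, not the paper's.
    """
    weight_sum = sum(weights.values())
    budget = {m: int(total * w / weight_sum) for m, w in weights.items()}
    # Entries lost to rounding down go to the highest-weight modality.
    leftover = total - sum(budget.values())
    budget[max(weights, key=weights.get)] += leftover
    return budget


def select_entries(scores: List[float], keep: int) -> List[int]:
    """Return indices of the `keep` highest-scoring entries, in original order.

    `scores` stands in for whatever importance signal is used; how a
    perturbation-aware score would be computed is not specified here.
    """
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:keep])


def compress(
    scores_by_modality: Dict[str, List[float]],
    total: int,
    weights: Dict[str, float],
) -> Dict[str, List[int]]:
    """Pick which cached entries to retain for each modality."""
    budget = allocate_budget(total, weights)
    return {m: select_entries(s, budget[m]) for m, s in scores_by_modality.items()}


if __name__ == "__main__":
    kept = compress(
        {"audio": [0.2, 0.8, 0.5], "video": [0.1, 0.9, 0.5, 0.7, 0.3]},
        total=4,
        weights={"audio": 1.0, "video": 1.0},
    )
    print(kept)
```

Returning the kept indices in their original order preserves the temporal order of the retained states, which matters for a streaming cache.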