Why Limit the Residual Stream to Layers and Not Tokens? Persistent Memory for Continuous Latent Reasoning
Large language models (LLMs) face a limitation called the 'concept bottleneck': during deep latent reasoning, they lose critical facts. This paper proposes AGCLR (Adaptive Gated Continuous Latent Reasoning), which addresses this bottleneck by augmenting CoCoNuT with a Gated Concept Stream that provides persistent memory.
