Continual Learning

20 items

RESEARCH·↑ trending·Reddit r/MachineLearning·27d ago

Learning, Fast and Slow: Towards LLMs That Adapt Continually [R]

Large language models (LLMs) face catastrophic forgetting and plasticity loss when updating parameters for downstream tasks. This work introduces a fast-slow learning framework for LLMs, utilizing model parameters as "slow" weights and optimized context as "fast" weights to adapt efficiently without compromising general reasoning.

LLMs learning Catastrophic Forgetting AI Research

RESEARCH·arXiv CS.AI·4/17/2026

Mistake gating leads to energy and memory efficient continual learning

This paper proposes 'memorized mistake-gated learning,' a biologically plausible plasticity rule where synaptic updates are strictly gated by current and past classification errors. This method reduces network updates by 50-80%, significantly enhancing energy and memory efficiency in continual and online learning scenarios.

neural networks efficiency learning algorithms Continual Learning
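
The gating idea in the summary above can be illustrated with a small perceptron-style learner. This is a minimal sketch under stated assumptions, not the paper's rule: the function name `train_mistake_gated`, the one-step error memory `past_mistake`, and the perceptron update are all hypothetical choices made for illustration.

```python
import random

def train_mistake_gated(samples, epochs=5, lr=0.1, seed=0):
    """Train a linear classifier whose updates are gated by mistakes.

    samples: list of (features, label) pairs with label in {-1, +1}.
    Assumption: the gate opens if the sample is misclassified now, or was
    misclassified the last time it was presented (a one-step error memory).
    """
    rng = random.Random(seed)
    dim = len(samples[0][0])
    w = [0.0] * dim
    b = 0.0
    past_mistake = [False] * len(samples)  # hypothetical per-sample error memory
    updates = 0
    presentations = 0
    for _ in range(epochs):
        order = list(range(len(samples)))
        rng.shuffle(order)
        for i in order:
            x, y = samples[i]
            presentations += 1
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            pred = 1 if score >= 0 else -1
            wrong_now = pred != y
            if wrong_now or past_mistake[i]:
                # Gate open: apply a plain perceptron step.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
                updates += 1
            # Gate closed otherwise: no weight change, no work done.
            past_mistake[i] = wrong_now
    return w, b, updates, presentations

def predict(w, b, x):
    """Return the predicted label in {-1, +1} for feature vector x."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1
```

Comparing `updates` with `presentations` on a given dataset gives a rough sense of how many weight writes the gate skips; the 50-80% figure quoted above is the paper's own result and is not reproduced by this toy.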

RESEARCH·arXiv CS.LG·4/16/2026

Adaptive Memory Crystallization for Autonomous AI Agent Learning in Dynamic Environments

This research introduces Adaptive Memory Crystallization (AMC), a novel memory architecture designed for autonomous AI agents to progressively consolidate experiences in dynamic environments without forgetting prior knowledge. AMC models memory as a continuous crystallization process across a three-phase hierarchy, inspired by synaptic tagging and capture theory and governed by stochastic differential equations.

reinforcement learning machine learning memory architecture AI agents

RESEARCH·arXiv CS.AI·4/14/2026

AHC: Meta-Learned Adaptive Compression for Continual Object Detection on Memory-Constrained Microcontrollers

Adaptive Hierarchical Compression (AHC) is a meta-learning framework for continual object detection on memory-constrained microcontrollers, adapting to evolving task distributions. It employs MAML-based adaptive compression, hierarchical multi-scale compression, and a dual-memory architecture to prevent catastrophic forgetting within a strict 100KB memory budget.

Meta-Learning Adaptive Compression Microcontrollers object detection

RESEARCH·arXiv CS.LG·5d ago

Position: Deployed Reinforcement Learning should be Continual

This position paper argues that deployed Reinforcement Learning (RL) agents should engage in continual learning rather than a train-then-fix paradigm. It identifies four sources of non-stationarity post-deployment, highlighting the necessity for agents to continuously adapt to achieve optimal performance in real-world scenarios.

reinforcement learning learning Adaptive AI AI deployment

RESEARCH·arXiv CS.CL·4/20/2026

Why Fine-Tuning Encourages Hallucinations and How to Fix It

Large language models often hallucinate facts, a problem exacerbated by supervised fine-tuning (SFT) which degrades pre-trained knowledge. This research proposes a self-distillation SFT method, inspired by continual learning, to mitigate hallucinations by regularizing output-distribution drift while effectively acquiring new factual information.

hallucinations large language models Fine-tuning Continual Learning
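
One common way to regularize output-distribution drift during fine-tuning is to add a KL penalty toward the pre-fine-tuning model's predictions. The sketch below shows that generic form for a single token position; it is an assumption about the shape of such a loss, not the paper's exact objective. The names `regularized_sft_loss` and `beta`, and the KL direction (reference to student), are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target_index):
    """Standard SFT loss for one token: negative log-probability of the target."""
    return -math.log(softmax(logits)[target_index])

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions with matching support."""
    return sum(pi * (math.log(pi) - math.log(qi)) for pi, qi in zip(p, q) if pi > 0)

def regularized_sft_loss(student_logits, reference_logits, target_index, beta=0.5):
    """Task loss plus a penalty on drift away from the reference model.

    reference_logits come from a frozen copy of the model taken before
    fine-tuning (the "self" in self-distillation, as assumed here).
    beta is a hypothetical weight trading new-fact fitting against drift.
    """
    task = cross_entropy(student_logits, target_index)
    drift = kl_divergence(softmax(reference_logits), softmax(student_logits))
    return task + beta * drift
```

When the student still matches the reference, the penalty is zero and the loss reduces to ordinary cross-entropy; the penalty grows as the fine-tuned distribution moves away from the original one.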

ARTICLE·DEV.to AI·28d ago

DeepMind’s CEO Says AGI May Be ~4 Years Away. The Last Three Missing Pieces Are Not What Most People Think.

DeepMind CEO Demis Hassabis predicts AGI could arrive around 2030, identifying three critical missing pieces in current AI: continual learning, long-term reasoning, and real memory. He describes today's models as exhibiting "jagged intelligence," with strong peaks alongside brittle failures.

DeepMind AGI Reasoning AI development

RESEARCH·DEV.to AI·4/26/2026

Deep Generative Dual Memory Network for Continual Learning

Judging by its title, this work describes a deep generative neural network architecture with a dual-memory design. The goal is continual learning: letting the model acquire new information without forgetting previously learned knowledge.

neural networks deep learning Continual Learning Generative AI

ARTICLE·DEV.to AI·24d ago

Meta-Optimized Continual Adaptation for heritage language revitalization programs under multi-jurisdictional compliance

The author realized the critical need for AI in endangered language preservation, encountering challenges like catastrophic forgetting in neural machine translation systems and complex multi-jurisdictional data sovereignty laws. The work focuses on meta-optimized continual adaptation for heritage language revitalization programs.

data compliance AI indigenous languages language revitalization

RESEARCH·DEV.to AI·4/21/2026

Continual Learning via Neural Pruning

This item likely explores how neural pruning can support continual learning, a key challenge in AI. It aims to demonstrate how pruning lets models acquire new knowledge sequentially without forgetting previously learned information.

neural-pruning machine learning Continual Learning

RESEARCH·DEV.to AI·4/14/2026

Don't forget, there is more than forgetting: new metrics for Continual Learning

This content introduces novel metrics for Continual Learning, broadening evaluation beyond just preventing catastrophic forgetting. It proposes a more comprehensive view for measuring AI model performance in sequential learning scenarios.

AI metrics evaluation machine learning Catastrophic Forgetting
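
For context, continual-learning results are often summarized from an accuracy matrix recorded after each task. The sketch below computes two widely used baseline quantities, final average accuracy and backward transfer; these are not the new metrics the item proposes, and the matrix layout and function names are assumptions made for illustration.

```python
def average_accuracy(acc):
    """Mean accuracy over all tasks after training on the final task.

    acc[i][j] is the accuracy on task j measured after training on task i
    (an assumed layout: one row per training stage, one column per task).
    """
    final_row = acc[-1]
    return sum(final_row) / len(final_row)

def backward_transfer(acc):
    """Average change in accuracy on earlier tasks after all training is done.

    Negative values indicate forgetting; positive values indicate that later
    tasks improved performance on earlier ones.
    """
    num_tasks = len(acc)
    if num_tasks < 2:
        raise ValueError("backward transfer needs at least two tasks")
    deltas = [acc[-1][j] - acc[j][j] for j in range(num_tasks - 1)]
    return sum(deltas) / len(deltas)
```

A model can score well on average accuracy while still showing strongly negative backward transfer, which is one reason a single number is a thin basis for evaluation.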

RESEARCH·arXiv CS.LG·5/1/2026

NORACL: Neurogenesis for Oracle-free Resource-Adaptive Continual Learning

The paper proposes NORACL, inspired by biological neurogenesis, to address the stability-plasticity dilemma in continual learning. It tackles the oracle architecture problem, where finite networks have limited resources for unknown future tasks.

neural networks machine learning neurogenesis Continual Learning

RESEARCH·arXiv CS.LG·5/1/2026

When Continual Learning Moves to Memory: A Study of Experience Reuse in LLM Agents

This study investigates the role of external memory in LLM agents for continual learning, showing that the stability-plasticity dilemma resurfaces at the memory level due to limited context windows. A (k,v) framework is introduced to disentangle how experience is represented and organized, finding that abstract procedural memories transfer more reliably than detailed trajectories and finer-grained memory organization is beneficial.

research memory AI agents Continual Learning

RESEARCH·arXiv CS.LG·19d ago

CP-MoE: Consistency-Preserving Mixture-of-Experts for Continual Learning

CP-MoE addresses catastrophic forgetting in continual learning for LLMs and VLMs using Mixture-of-Experts architectures. It introduces a transient expert and consistency-preserving routing to integrate new knowledge while preventing the overwriting of existing parameters.

LLMs VLMs learning Mixture of Experts

RESEARCH·arXiv CS.LG·5/7/2026

Continual Distillation of Teachers from Different Domains

This research introduces Continual Distillation (CD), a new paradigm in which a student model learns sequentially from a stream of teacher models without retaining access to earlier teachers. It addresses two challenges, unseen knowledge transfer (UKT) and unseen knowledge forgetting (UKF), through Self External Data Distillation (SE2D), which uses external unlabeled data to stabilize learning across heterogeneous teachers.

Knowledge Distillation deep learning learning Continual Learning

RESEARCH·arXiv CS.AI·29d ago

CASCADE: Case-Based Continual Adaptation for Large Language Models During Deployment

This paper introduces Deployment-Time Learning (DTL) as a new stage for LLMs, allowing them to continually adapt from experience post-training without modifying core parameters. It presents CASCADE, a framework that uses an explicit, evolving episodic memory for LLM agents, formalizing experience reuse as a contextual bandit problem with no-regret guarantees.

LLMs adaptation machine learning AI deployment
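
Treating experience reuse as a bandit problem can be sketched with a small episodic memory that picks which stored case to reuse and learns from the outcome, leaving model parameters untouched. This is an illustrative sketch only: the class `CaseMemory`, its methods, and the use of the standard UCB1 rule are assumptions, not CASCADE's actual algorithm or interface.

```python
import math
from collections import defaultdict

class CaseMemory:
    """Episodic memory that selects stored cases with the UCB1 bandit rule."""

    def __init__(self):
        self.cases = defaultdict(list)        # context -> stored cases
        self.pulls = defaultdict(int)         # (context, case) -> times reused
        self.reward_sum = defaultdict(float)  # (context, case) -> total reward

    def add_case(self, context, case):
        """Store a new case for a context; duplicates are ignored."""
        if case not in self.cases[context]:
            self.cases[context].append(case)

    def select(self, context):
        """Return the case to reuse for this context, or None if none exist."""
        candidates = self.cases.get(context, [])
        if not candidates:
            return None
        # Try every case once before comparing estimates.
        for case in candidates:
            if self.pulls[(context, case)] == 0:
                return case
        total = sum(self.pulls[(context, c)] for c in candidates)

        def ucb(case):
            n = self.pulls[(context, case)]
            mean = self.reward_sum[(context, case)] / n
            return mean + math.sqrt(2.0 * math.log(total) / n)

        return max(candidates, key=ucb)

    def update(self, context, case, reward):
        """Record the observed reward (assumed to lie in [0, 1])."""
        self.pulls[(context, case)] += 1
        self.reward_sum[(context, case)] += reward
```

In use, an agent would call `select` before acting, condition its prompt on the returned case, and call `update` with a task-success signal afterward; here the context is a plain hashable label, whereas a real system would likely match contexts by similarity.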

RESEARCH·arXiv CS.LG·12d ago

Architecture-driven Shift: towards a lightweight selector for capturing the trends of logit shift

This paper proposes a new lightweight selector to capture logit shift trends in Continual Learning (CL), a computationally expensive challenge in pre-trained model selection. The research addresses architectural heterogeneity in neural networks by decoupling architecture and data dependency to establish a new theoretical framework.

neural networks model selection learning Logit Shift

RESEARCH·arXiv CS.CL·4/6/2026

Revealing the Learning Dynamics of Long-Context Continual Pre-training

This paper systematically investigates the learning dynamics of Long-Context Continual Pre-training (LCCP) using the industrial model Hunyuan-A13B, tracking its evolution over 200 billion tokens. It proposes a hierarchical framework for analyzing LCCP at the behavioral, probabilistic, and mechanistic levels, addressing the limitations of current evaluation and pre-training methodologies.

Long-Context Continual Pre-training Model Evaluation Pre-training Dynamics large language models

NEWS·LangChain Blog·21d ago

Introducing LangChain Labs

LangChain Labs is a new applied research effort focused on continual learning for agents. It aims, with partners, to advance open research on self-improving AI systems.

LangChain self-improving AI AI Research AI agents

ARTICLE·LangChain Blog·4/5/2026

Continual learning for AI agents

This content discusses continual learning for AI agents, proposing that learning extends beyond just updating model weights. It introduces three distinct layers where learning can occur – the model, the harness, and the context – emphasizing how this perspective changes the approach to building improving AI systems.

Model weights AI system design machine learning AI agents