Mechanistic origins of catastrophic forgetting: why RL preserves circuits better than SFT?

arXiv cs.LG · May 29, 2026

This paper investigates the mechanistic origins of catastrophic forgetting in large language models (LLMs) by comparing reinforcement learning (RL) with supervised fine-tuning (SFT). It finds that RL preserves internal computational circuits more effectively and so mitigates the forgetting of prior capabilities, whereas SFT disrupts those circuits to a greater degree.
