grokking

4 items

RESEARCH · arXiv CS.LG · 4/16/2026

The Long Delay to Arithmetic Generalization: When Learned Representations Outrun Behavior

This research investigates the 'grokking' phenomenon in transformers, finding that the long delay to generalization in arithmetic models stems from a decoder bottleneck. The encoder acquires relevant structural knowledge early, but the decoder struggles to access it, a hypothesis supported by causal interventions like transplanting encoders.

grokking, machine learning, representation learning, Transformers

RESEARCH · arXiv CS.LG · 4/16/2026

Spectral Entropy Collapse as an Empirical Signature of Delayed Generalisation in Grokking

This paper identifies normalized spectral entropy as a scalar order parameter for the grokking transition, where models generalize long after memorization. The research shows that entropy collapse precedes generalization, and causal interventions confirm its critical role, providing a predictive model for grokking onset.

neural networks, grokking, Generalization, deep learning
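
As a rough illustration of the quantity named in the summary above, here is a minimal sketch of a normalized spectral entropy. The summary does not say which spectrum the paper uses or how it is normalized, so everything here is an assumption: the function name is hypothetical, the input is taken to be a list of singular values (for example, of a weight matrix), and the normalization divides the Shannon entropy by log(n) so the result lies between 0 and 1.

```python
import math


def normalized_spectral_entropy(singular_values):
    """Shannon entropy of a normalized spectrum, scaled to [0, 1].

    Assumption: p_i = s_i / sum(s). The paper may define p_i differently
    (for example, from squared singular values).
    """
    n = len(singular_values)
    total = sum(singular_values)
    if n < 2 or total <= 0:
        return 0.0
    # Zero entries contribute nothing to the entropy, so skip them.
    probs = [s / total for s in singular_values if s > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    # Dividing by log(n) maps a perfectly flat spectrum to 1.
    return entropy / math.log(n)


# A flat spectrum has maximal entropy; a single dominant direction has none.
flat = normalized_spectral_entropy([1.0, 1.0, 1.0, 1.0])
peaked = normalized_spectral_entropy([10.0, 0.0, 0.0, 0.0])
```

Under this definition, a "collapse" would show up as the value moving from near 1 toward 0 as a few directions come to dominate the spectrum.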

RESEARCH · arXiv CS.LG · 4/24/2026

ILDR: Geometric Early Detection of Grokking

This paper proposes the Inter/Intra-class Distance Ratio (ILDR) as a novel geometric signal for the early detection of 'grokking' in neural networks. ILDR, computed from penultimate-layer representations, indicates geometric reorganization in representation space before validation accuracy improves, outperforming existing detection methods.

neural networks, grokking, machine learning, early-detection
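
To make the idea of an inter/intra-class distance ratio concrete, here is a minimal sketch. The summary above does not give the paper's exact formula, so this is an assumed definition rather than the authors' method: inter-class distance is taken as the mean Euclidean distance between class centroids, intra-class distance as the mean distance from each point to its own class centroid. The function names are hypothetical.

```python
import math
from itertools import combinations


def centroid(points):
    """Coordinate-wise mean of a list of equal-length vectors."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def inter_intra_ratio(features, labels):
    """Assumed ratio: mean centroid-to-centroid distance divided by
    mean point-to-own-centroid distance. Larger means tighter,
    better-separated classes."""
    by_class = {}
    for x, y in zip(features, labels):
        by_class.setdefault(y, []).append(x)
    if len(by_class) < 2:
        raise ValueError("need at least two classes")
    centroids = {y: centroid(pts) for y, pts in by_class.items()}
    intra = [math.dist(x, centroids[y]) for x, y in zip(features, labels)]
    inter = [math.dist(a, b) for a, b in combinations(centroids.values(), 2)]
    mean_intra = sum(intra) / len(intra)
    if mean_intra == 0:
        raise ValueError("intra-class distance is zero")
    return (sum(inter) / len(inter)) / mean_intra


# Two toy "representations" of the same four labeled points.
labels = [0, 0, 1, 1]
tight = inter_intra_ratio([[0, 0], [0, 2], [10, 0], [10, 2]], labels)
loose = inter_intra_ratio([[0, 0], [0, 10], [10, 0], [10, 10]], labels)
```

In this toy case the class centroids are equally far apart in both layouts, but the tighter clusters give a higher ratio. Tracking such a ratio over training steps is the kind of signal the summary describes.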

RESEARCH · arXiv CS.LG · 28d ago

Feature Repulsion and Spectral Lock-in: An Empirical Study of Two-Layer Network Grokking

This empirical study investigates Tian's (2025) feature repulsion theorem in two-layer network grokking, testing its mechanisms and spectral signatures. It observes a clear structure-mechanism dissociation, with the predicted sign rule robustly holding for similar feature pairs despite a strong activation dependence in the spectral signature.

neural networks, feature learning, grokking, deep learning