training dynamics

2 items

RESEARCH · arXiv CS.AI · 1d ago

Position: Don't Just "Fix it in Post": A Science of AI Must Study Training Dynamics

This position paper argues for a scientific understanding of AI that focuses on studying training dynamics, rather than just analyzing models post-training. It emphasizes predicting outcomes, intervening when issues arise, and designing training procedures to reliably produce desired properties, extending the success of scaling laws beyond loss to capabilities, biases, robustness, and safety.

60
RESEARCH · arXiv CS.LG · 4/28/2026

The Spectral Lifecycle of Transformer Training: Transient Compression Waves, Persistent Spectral Gradients, and the Q/K–V Asymmetry

This systematic study of singular value spectra during transformer pretraining reports three phenomena: transient compression waves that propagate through the layers, persistent spectral gradients, and a Q/K–V functional asymmetry in which query/key projections drive depth-dependent dynamics while value/output projections compress uniformly.

29