Initialization

1 item

RESEARCH · arXiv cs.LG · 4/15/2026

Subcritical Signal Propagation at Initialization in Normalization-Free Transformers

This paper studies signal propagation at initialization in transformers using the averaged partial Jacobian norm (APJN) to measure gradient amplification. The theory extends APJN analysis, predicts the asymptotic behavior of APJN at large depth, and explains the subcriticality of normalization-free architectures like Dynamic Tanh and Dynamic erf transformers.
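As a rough illustration of the kind of quantity the summary refers to, the sketch below estimates an APJN-like number for a toy fully connected tanh network. It is not the paper's transformer setup. It assumes the APJN between layer 0 and layer l is the squared Frobenius norm of the Jacobian of layer-l preactivations with respect to layer-0 preactivations, divided by the width, and it uses a single random draw rather than an average over initializations. The function name `apjn_tanh_mlp` and the parameters `width`, `depth`, `sigma_w`, and `seed` are hypothetical.

```python
import math
import random


def matmul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def apjn_tanh_mlp(width, depth, sigma_w, seed=0):
    """Single-draw APJN estimate for a toy tanh MLP (illustrative only).

    The toy network is h_next = W @ tanh(h), with W entries drawn from
    N(0, sigma_w**2 / width). Returns a list whose entry l-1 is
    ||d h_l / d h_0||_F**2 / width for l = 1..depth.
    """
    rng = random.Random(seed)
    h = [rng.gauss(0.0, 1.0) for _ in range(width)]
    # Running Jacobian d h_l / d h_0, starting from the identity.
    jac = [[1.0 if i == j else 0.0 for j in range(width)] for i in range(width)]
    estimates = []
    for _ in range(depth):
        std = sigma_w / math.sqrt(width)
        w = [[rng.gauss(0.0, std) for _ in range(width)] for _ in range(width)]
        phi = [math.tanh(x) for x in h]
        dphi = [1.0 - p * p for p in phi]  # derivative of tanh
        # One-layer Jacobian: d h_next[j] / d h[i] = W[j][i] * tanh'(h[i]).
        layer_jac = [[w[j][i] * dphi[i] for i in range(width)] for j in range(width)]
        jac = matmul(layer_jac, jac)  # chain rule across layers
        h = [sum(w[j][i] * phi[i] for i in range(width)) for j in range(width)]
        estimates.append(sum(x * x for row in jac for x in row) / width)
    return estimates


if __name__ == "__main__":
    for layer, value in enumerate(apjn_tanh_mlp(32, 8, sigma_w=0.5), start=1):
        print(layer, value)
```

With a small weight scale such as `sigma_w=0.5`, the estimate in this toy model shrinks with depth, meaning gradients are damped rather than amplified. Whether and how this carries over to Dynamic Tanh or Dynamic erf transformers is the subject of the paper itself and is not shown by this sketch.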