Elastic Attention Cores for Scalable Vision Transformers [R]

Reddit r/MachineLearning · May 13, 2026
This paper introduces Elastic Attention Cores, a building block for scalable Vision Transformers that targets the high cost of dense self-attention. It combines a core-periphery block-sparse attention structure with nested dropout so that inference cost can be adjusted elastically, and it reports competitive accuracy.
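
To make the two ideas concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the summary does not specify the exact sparsity pattern, so the layout below is an assumption (core tokens attend to everything, periphery tokens attend to the cores and to their own block). The function names `core_periphery_mask`, `sample_active_core`, and `attended_pairs` are hypothetical, and the uniform sampling of the truncation index is also an assumption.

```python
import random


def core_periphery_mask(n_core, n_periph, block, active_core=None):
    """Boolean attention mask; mask[i][j] is True when query i may attend to key j.

    Tokens 0..n_core-1 are core tokens, the rest are periphery tokens.
    Assumed pattern (not taken from the paper):
      - an active core token attends to every active token;
      - every periphery token attends to all active core tokens;
      - periphery tokens attend to each other only within the same block.
    Core tokens with index >= active_core are treated as dropped.
    """
    if active_core is None:
        active_core = n_core
    n = n_core + n_periph
    mask = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            i_core, j_core = i < n_core, j < n_core
            if (i_core and i >= active_core) or (j_core and j >= active_core):
                continue  # a dropped core token neither attends nor is attended to
            if i_core or j_core:
                mask[i][j] = True
            else:
                # periphery-to-periphery: same block only
                mask[i][j] = (i - n_core) // block == (j - n_core) // block
    return mask


def sample_active_core(n_core, rng):
    """Nested-dropout-style truncation: keep core tokens 0..k-1 for a sampled k.

    Uniform sampling over 1..n_core is an assumption made for this sketch.
    """
    return rng.randint(1, n_core)


def attended_pairs(mask):
    """Number of (query, key) pairs the mask allows, a rough proxy for cost."""
    return sum(sum(row) for row in mask)


full = core_periphery_mask(n_core=2, n_periph=4, block=2)
small = core_periphery_mask(n_core=2, n_periph=4, block=2, active_core=1)
print(attended_pairs(full), attended_pairs(small))  # 28 17 (dense would be 36)
```

Because the kept cores always form a prefix, a model trained with a randomly sampled `active_core` could in principle be run at inference with any smaller prefix, which is one way to read the "elastic" cost adjustment described above.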
