RESEARCH

Elastic Attention Cores for Scalable Vision Transformers [R]

Reddit r/MachineLearning·May 13, 2026

The paper introduces Elastic Attention Cores, a building block for scalable Vision Transformers that targets the high cost of dense self-attention. It combines a core-periphery block-sparse attention structure with nested dropout, so that inference cost can be adjusted elastically while accuracy stays competitive.
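To make the two ingredients concrete, here is a minimal NumPy sketch of what a core-periphery attention mask and a nested-dropout truncation could look like. It is not the paper's implementation, and everything in it is an assumption for illustration: the function names (`core_periphery_mask`, `sample_nested_keep`, `masked_attention`), the token layout (core tokens first), the rule that periphery tokens attend only to core tokens and themselves, and the uniform sampling of the truncation index.

```python
import numpy as np

def core_periphery_mask(num_core, num_periphery):
    """Boolean mask where mask[i, j] is True if query i may attend to key j.

    Assumed layout (hypothetical): core tokens first, periphery tokens after.
    """
    n = num_core + num_periphery
    mask = np.zeros((n, n), dtype=bool)
    mask[:num_core, :] = True                 # core queries attend to every token
    mask[:, :num_core] = True                 # every query attends to core keys
    mask[np.arange(n), np.arange(n)] = True   # each token attends to itself
    return mask

def sample_nested_keep(num_core, rng):
    """Nested-dropout style truncation: pick k in [1, num_core] and keep
    only the first k core tokens. Uniform sampling is a simplification."""
    return int(rng.integers(1, num_core + 1))

def masked_attention(q, k, v, mask):
    """Single-head scaled dot-product attention restricted by a boolean mask."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = np.where(mask, scores, -np.inf)       # disallowed pairs get zero weight
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v

# Usage: drop the trailing core tokens, then attend with the smaller mask.
rng = np.random.default_rng(0)
num_core, num_periphery, dim = 4, 12, 8
tokens = rng.standard_normal((num_core + num_periphery, dim))

k_active = sample_nested_keep(num_core, rng)
kept = np.concatenate([tokens[:k_active], tokens[num_core:]])
mask = core_periphery_mask(k_active, num_periphery)
out = masked_attention(kept, kept, kept, mask)

# Fraction of query-key pairs actually evaluated, relative to dense attention.
density = mask.sum() / mask.size
```

Under these assumptions, the number of allowed query-key pairs grows linearly in the periphery size for a fixed number of core tokens, and choosing a smaller `k_active` at inference time is what would trade accuracy for cost.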

deep learning · computer vision · attention mechanisms · Vision Transformers
