RESEARCH
ZAYA1-8B Technical Report
arXiv CS.AI · May 9, 2026
ZAYA1-8B is a reasoning-focused mixture-of-experts (MoE) model with 700M active parameters, outperforming DeepSeek-R1-0528 on math and coding benchmarks. It was trained from scratch for reasoning on an AMD platform and uses a four-stage RL cascade for post-training.