Published on May 8, 2026
ZAYA1-8B marks a notable step forward in AI reasoning, with 700 million active parameters and 8 billion total parameters. Built on Zyphra’s MoE++ architecture, it targets complex mathematics and coding challenges, and it was pretrained and supervised fine-tuned on an AMD compute platform.
After pretraining and fine-tuning, ZAYA1-8B went through a four-stage reinforcement learning cascade, including reasoning warmups, a structured RL curriculum, and advanced behavioral training. At test time, the model applies Markovian RSA, a method for aggregating multiple reasoning traces into a final answer, which lifts its evaluation scores further.
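The article does not describe how Markovian RSA aggregates traces, so the sketch below is only a generic illustration of test-time trace aggregation: sample several reasoning traces and take a majority vote over their final answers. The `generate_trace` callable, its signature, and the function name `aggregate_traces` are assumptions made for illustration, not Zyphra’s actual method.

```python
from collections import Counter
from typing import Callable, List, Tuple

def aggregate_traces(
    generate_trace: Callable[[str], Tuple[str, str]],  # hypothetical stub: (reasoning, final answer)
    prompt: str,
    num_traces: int = 8,
) -> str:
    """Sample several reasoning traces and return the most common final answer.

    A plain majority vote, used here only to illustrate trace aggregation;
    it is not Zyphra's Markovian RSA.
    """
    answers: List[str] = []
    for _ in range(num_traces):
        _reasoning, answer = generate_trace(prompt)  # each call samples one independent trace
        answers.append(answer.strip())
    # Majority vote over the final answers of all sampled traces.
    most_common_answer, _count = Counter(answers).most_common(1)[0]
    return most_common_answer
```

In a real setup, `generate_trace` would wrap a sampled model call; more sophisticated aggregators weight or merge the traces themselves rather than simply voting on their answers.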
ZAYA1-8B narrows the performance gap with far larger models such as Gemini-2.5 Pro and GPT-5-High, scoring above 91% on AIME’25 and nearly 90% on HMMT’25. Those results set a new bar for compact reasoning-focused models and point to where future work on small architectures may go.