ZAYA1-8B Model Redefines AI Reasoning Capabilities

Published on May 8, 2026

ZAYA1-8B has emerged as a significant advancement in AI reasoning, with 700 million active parameters out of 8 billion total. Built on Zyphra’s MoE++ architecture, it targets complex mathematics and coding challenges, and it was pretrained and supervised fine-tuned entirely on an AMD compute platform.
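The internals of Zyphra's MoE++ architecture are not described here, but the active-vs-total parameter split above is the hallmark of mixture-of-experts routing: every token activates only a few experts out of many. A minimal, generic top-k MoE sketch (hypothetical dimensions, NumPy only, not Zyphra's actual design) illustrates the idea:

```python
import numpy as np

def moe_forward(x, gate_w, experts, k=2):
    """Generic top-k mixture-of-experts routing sketch.

    x        : (d_model,) token hidden state
    gate_w   : (n_experts, d_model) router weights
    experts  : list of (W1, W2) feed-forward weight pairs
    k        : experts activated per token (<< n_experts)
    """
    logits = gate_w @ x                       # router score for each expert
    top = np.argsort(logits)[-k:]             # indices of the k best experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                  # softmax over selected experts only
    out = np.zeros_like(x)
    for w, i in zip(weights, top):
        W1, W2 = experts[i]
        out += w * (W2 @ np.maximum(W1 @ x, 0.0))  # weighted expert FFN output
    return out

# Toy instantiation: 8 experts but only 2 run per token,
# so active parameters are a fraction of total parameters.
rng = np.random.default_rng(0)
d, n_exp, d_ff = 16, 8, 32
x = rng.standard_normal(d)
gate_w = rng.standard_normal((n_exp, d))
experts = [(rng.standard_normal((d_ff, d)), rng.standard_normal((d, d_ff)))
           for _ in range(n_exp)]
y = moe_forward(x, gate_w, experts, k=2)
print(y.shape)
```

Only the router and the k selected expert feed-forward networks execute per token, which is how an 8-billion-parameter model can run with a 700-million-parameter compute footprint.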

Following its training regimen, ZAYA1-8B was further trained with a four-stage reinforcement learning cascade, including reasoning warmups, a structured RL curriculum, and advanced behavioral training. At test time, Markovian RSA introduced a novel approach to aggregating reasoning traces, contributing to the model's strong evaluation scores.
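The article does not spell out how Markovian RSA works. Reading "Markovian" as memoryless aggregation, one plausible sketch is an iterative loop in which each round samples fresh answers conditioned only on the previous round's aggregate (never the full history) and keeps the majority answer as the new state. All names below (`markovian_aggregate`, `sample_fn`) are hypothetical illustrations, not Zyphra's API:

```python
import random
from collections import Counter

def markovian_aggregate(sample_fn, rounds=3, samples_per_round=5):
    """Hypothetical Markovian aggregation of reasoning traces.

    Each round draws answers conditioned only on the previous round's
    aggregate (the Markov property: no full history is carried), then
    takes a majority vote to form the next state.
    """
    state = None
    for _ in range(rounds):
        answers = [sample_fn(state) for _ in range(samples_per_round)]
        state = Counter(answers).most_common(1)[0][0]  # majority answer
    return state

# Toy stand-in for a model: a noisy sampler biased toward "42".
random.seed(0)
def sample_fn(prev_aggregate):
    pool = ["42", "41", "42", "42", prev_aggregate or "41"]
    return random.choice(pool)

result = markovian_aggregate(sample_fn)
print(result)
```

Because each round depends only on the current aggregate, memory stays constant in the number of rounds, which would make this style of test-time aggregation cheap to scale.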

ZAYA1-8B narrows the performance gap with far larger models such as Gemini-2.5 Pro and GPT-5-High. With scores exceeding 91% on AIME’25 and nearly 90% on HMMT’25, it sets a new benchmark for reasoning-focused AI at this scale and points the way for future small-model innovation.
