
Zyphra Breakthrough: Can ZAYA1 Prove AMD Is Ready to Rival NVIDIA?

  • Writer: Covertly AI
  • Nov 26
  • 4 min read

Zyphra’s unveiling of ZAYA1 marks a significant moment in the AI hardware landscape, proving that large-scale model training can succeed outside NVIDIA’s well-established ecosystem. 



After a year of collaboration with AMD and IBM, Zyphra trained ZAYA1 entirely on AMD Instinct MI300X GPUs, Pensando networking, and the ROCm software stack, demonstrating that AMD’s platform can support major foundation models without exotic configurations or performance compromises (Artificial Intelligence News). The achievement is even more notable because the system was built like a standard enterprise cluster, with simple networking, predictable iteration times, and none of the specialized hardware typically associated with cutting-edge AI development. This approach underscores the project’s central message: organizations now have a viable alternative for scaling AI without relying solely on NVIDIA.


ZAYA1’s architecture helps explain why this milestone matters. The model activates 760 million of its 8.3 billion total parameters and was trained on 12 trillion tokens across three stages, using innovations such as compressed attention, refined routing for expert selection, and optimized residual scaling to stabilize deeper layers (Artificial Intelligence News). Zyphra also tuned the Muon optimizer for AMD hardware by fusing kernels and minimizing unnecessary memory traffic, which kept training iterations efficient. These design decisions allowed ZAYA1 to deliver competitive results in reasoning, mathematics, and coding, matching or outperforming leading models like Qwen3-4B, Gemma3-12B, Llama-3-8B, and OLMoE. The Mixture-of-Experts structure further reduces memory requirements during inference, making the model particularly attractive for enterprise deployment.
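To put the sparsity figures in perspective, the short Python calculation below works out what share of ZAYA1’s parameters is active for each token. It uses only the two parameter counts quoted above; the function and variable names are illustrative and do not come from Zyphra’s report.

```python
def active_fraction(active_params: float, total_params: float) -> float:
    """Return the share of a model's parameters that is used per token."""
    if total_params <= 0:
        raise ValueError("total_params must be positive")
    return active_params / total_params


ACTIVE_PARAMS = 760e6  # active parameters, as quoted in the article
TOTAL_PARAMS = 8.3e9   # total parameters, as quoted in the article

fraction = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
print(f"Active share per token: {fraction:.1%}")  # prints "Active share per token: 9.2%"
```

In other words, roughly one parameter in eleven takes part in processing any given token, which is where the per-token compute savings of a Mixture-of-Experts design come from.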



Behind the scenes, Zyphra’s work highlighted the practical realities of running large-scale AI training on AMD hardware. Transitioning workflows from NVIDIA’s CUDA ecosystem to ROCm required reshaping GEMM patterns, adjusting model dimensions, rethinking microbatching, and tuning collective operations for AMD’s Infinity Fabric and Pollara networking (Artificial Intelligence News). Long-context training relied on ring and tree-based attention mechanisms to avoid bottlenecks, while storage pipelines were redesigned to balance IOPS and bandwidth needs. Zyphra also introduced Aegis, a monitoring service that automatically detects and resolves issues like NIC glitches or ECC errors, so that training jobs, which often run for weeks, can continue with minimal disruption. Distributed checkpointing made saves more than ten times faster, a critical improvement for cluster stability.
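The general idea behind distributed checkpointing can be sketched in a few lines of Python: instead of one process writing the whole training state, the state is split into shards that are written in parallel. The example below is a toy illustration only, not Zyphra’s implementation. It assumes a small dictionary stands in for model state, threads stand in for training ranks, and JSON files stand in for a real checkpoint format; every function name here is hypothetical.

```python
import json
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def shard_state(state: dict, num_shards: int) -> list[dict]:
    """Split a state dict into num_shards pieces, round-robin over sorted keys."""
    shards = [{} for _ in range(num_shards)]
    for i, key in enumerate(sorted(state)):
        shards[i % num_shards][key] = state[key]
    return shards


def write_shard(directory: Path, rank: int, shard: dict) -> Path:
    """Write one shard to a temporary file, then rename it into place."""
    path = directory / f"shard_{rank:05d}.json"
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(shard))
    tmp.replace(path)  # the rename keeps half-written files out of the final set
    return path


def save_checkpoint(state: dict, directory: Path, num_shards: int) -> list[Path]:
    """Write all shards concurrently; each thread plays the role of one rank."""
    directory.mkdir(parents=True, exist_ok=True)
    shards = shard_state(state, num_shards)
    with ThreadPoolExecutor(max_workers=num_shards) as pool:
        futures = [
            pool.submit(write_shard, directory, rank, shard)
            for rank, shard in enumerate(shards)
        ]
        return [future.result() for future in futures]


def load_checkpoint(directory: Path) -> dict:
    """Merge every shard file back into a single state dict."""
    state = {}
    for path in sorted(directory.glob("shard_*.json")):
        state.update(json.loads(path.read_text()))
    return state


# Toy "model state": eight named tensors represented as short lists of floats.
state = {f"layer{i}.weight": [float(i), float(i) + 0.5] for i in range(8)}

with tempfile.TemporaryDirectory() as tmp_dir:
    ckpt_dir = Path(tmp_dir) / "step_100"
    paths = save_checkpoint(state, ckpt_dir, num_shards=4)
    restored = load_checkpoint(ckpt_dir)

print(len(paths), restored == state)
```

On a real cluster each rank would write only the shard it already holds in memory, so save time depends on the size of one shard rather than the whole model. The speed-up reported in the article comes from Zyphra’s own system, not from this sketch.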


For AMD, the success of ZAYA1 reinforces its growing position in the AI infrastructure market. With a market capitalization of $340.84 billion and a strong footprint in data center and GPU markets, AMD is rapidly emerging as a serious contender in AI workloads traditionally dominated by NVIDIA (GuruFocus). The company’s Instinct MI300X GPUs, which offer 192 GB of high-bandwidth memory per GPU, give engineers more flexibility during early training phases, reducing the need for complex parallelism and enabling more predictable scaling. Financially, AMD maintains a strong balance sheet, though declining operating margins and recent insider selling activity present areas of scrutiny. Even so, Zyphra’s technical report strengthens the case that AMD’s AI hardware can support demanding production-level workloads.
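A rough back-of-the-envelope calculation helps show why 192 GB per GPU leaves headroom for a model of this size. The Python sketch below estimates the footprint of the weights alone. The 2 bytes per parameter (16-bit precision) is an assumption for illustration, not a figure from the article, and the estimate deliberately ignores gradients, optimizer state, and activations, which add substantially to memory use during training.

```python
HBM_PER_GPU_GB = 192   # MI300X memory per GPU, as quoted in the article
TOTAL_PARAMS = 8.3e9   # ZAYA1 total parameters, as quoted in the article
BYTES_PER_PARAM = 2    # assumption: 16-bit weights (not stated in the article)


def weights_gb(params: float, bytes_per_param: int) -> float:
    """Approximate size of the model weights in decimal gigabytes."""
    return params * bytes_per_param / 1e9


w = weights_gb(TOTAL_PARAMS, BYTES_PER_PARAM)
share = w / HBM_PER_GPU_GB
print(f"Weights: {w:.1f} GB, about {share:.1%} of one GPU's memory")
```

Under these assumptions the weights occupy about 16.6 GB, under a tenth of a single GPU’s memory, which is consistent with the article’s point that large per-GPU memory reduces the need to split the model across devices early on.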



Zyphra’s breakthrough also carries implications for enterprise AI procurement. Rather than encouraging companies to abandon existing NVIDIA clusters, the report suggests a hybrid approach: use NVIDIA systems where they excel, while leveraging AMD’s memory-rich MI300X GPUs and open software stack for specific training stages that benefit from larger headroom and lower cost (Artificial Intelligence News). This strategy reduces supplier risk, increases overall training capacity, and helps organizations keep pace with AI development at a time when GPU availability and pricing remain challenging. As Zyphra and AMD’s collaboration with IBM demonstrates, AMD’s ecosystem, from Infinity Fabric to ROCm, has reached a level of maturity that makes such hybrid scaling both feasible and attractive.


In the broader investment and infrastructure landscape, Zyphra’s announcement elevates AMD’s profile at a pivotal moment in AI’s growth. Analysts note that the company’s expanding influence in AI hardware, supported by achievements like ZAYA1, strengthens its long-term prospects despite financial pressures and competitive headwinds (Yahoo Finance). As AI workloads continue to diversify, platforms capable of delivering high performance without vendor lock-in will become increasingly valuable. ZAYA1 not only showcases technical proficiency but also signals a shift in how enterprises may structure their AI infrastructure in the years ahead.


This article was written by the Covertly.AI team. Covertly.AI is a secure, anonymous AI chat that protects your privacy. Connect to advanced AI models without tracking, logging, or exposure of your data. Whether you’re an individual who values privacy or a business seeking enterprise-grade data protection, Covertly.AI helps you stay secure and anonymous when using AI. With Covertly.AI, you get seamless access to all popular large language models without compromising your identity or data privacy.


Try Covertly.AI today for free at www.covertly.ai, or contact us to learn more about custom privacy and security solutions for your business.  



Works Cited

 

“ZAYA1: AI Model Using AMD GPUs for Training Hits Milestone.” Artificial Intelligence News, www.artificialintelligence-news.com/news/zaya1-ai-model-using-amd-gpus-for-training-hits-milestone/.


“AMD Hits New AI Training Milestone with ZAYA1 Model.” GuruFocus, www.gurufocus.com/news/3222097/amd-hits-new-ai-training-milestone-with-zaya1-model.


“Could Zyphra’s AI Breakthrough Reveal New Competitive Strengths for AMD (AMD) in Infrastructure?” Yahoo Finance, ca.finance.yahoo.com/news/could-zyphra-ai-breakthrough-reveal-042442670.html.


Trueman, Charlotte. “AMD Launches MI300X GPUs and MI300A APUs in Generative AI Push.” Data Centre Dynamics, 7 Dec. 2023, www.datacenterdynamics.com/en/news/amd-launches-mi300x-gpu-and-mi300a-apus-in-generative-ai-push/.


Gardner, Timothy, and Max A. Cherney. “Exclusive: US Department of Energy Forms $1 Billion Supercomputer and AI Partnership with AMD.” Reuters, 27 Oct. 2025, www.reuters.com/business/energy/us-department-energy-forms-1-billion-supercomputer-ai-partnership-with-amd-2025-10-27/.


“ZAYA1: Zyphra’s Ground-breaking MoE Model Trained Exclusively on AMD.” Tech-Now.io, tech-now.io/en/blogs/zaya1-zyphras-groundbreaking-moe-model-trained-exclusively-on-amd.

