NVIDIA has partnered with Mistral AI to unveil the Mistral 3 family of open-source multilingual and multimodal models, designed to deliver industry-leading accuracy and efficiency for enterprise AI workloads. Released on December 2, 2025, these models use a mixture-of-experts (MoE) architecture that routes each token through only a small subset of expert subnetworks, enabling scalable and practical AI solutions without sacrificing accuracy.
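To make the MoE idea concrete: for Mistral Large 3, only 41 billion of 675 billion total parameters (roughly 6%) are active per token. The sketch below shows generic top-k router gating in plain Python; the expert count, top-k value, and gating scheme are illustrative assumptions, not Mistral 3's actual configuration.

```python
# Minimal sketch of MoE top-k routing: a router scores all experts,
# but only the top_k highest-scoring experts run for each token.
# Sizes and the gating scheme are illustrative, not Mistral 3's real config.
import math
import random

NUM_EXPERTS = 8  # assumed expert count (illustrative)
TOP_K = 2        # experts activated per token (illustrative)

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route_token(router_logits, top_k=TOP_K):
    """Select the top_k experts for one token and renormalize their weights."""
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:top_k]
    weights = softmax([router_logits[i] for i in chosen])
    return list(zip(chosen, weights))

random.seed(0)
logits = [random.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
assignment = route_token(logits)
# Only TOP_K of NUM_EXPERTS experts execute for this token,
# and their mixing weights sum to 1 -- the rest of the model stays idle.
```

Because the inactive experts are skipped entirely, per-token compute scales with the active parameter count rather than the total, which is why a 675B-parameter model can serve tokens at the cost of a much smaller dense model.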
Performance and Efficiency Breakthroughs
The flagship Mistral Large 3 model features 41 billion active parameters with 675 billion total parameters and an extensive 256K token context window, offering unprecedented scalability. Leveraging NVIDIA GB200 NVL72 systems with NVLink’s coherent memory domain and advanced parallelism, Mistral Large 3 achieves a 10x performance boost compared to the previous NVIDIA H200 generation. This translates into superior user experiences, reduced per-token costs, and greater energy efficiency for AI training and inference.
Edge and Open-Source Accessibility
Beyond large models, Mistral AI also released nine smaller models optimized for NVIDIA’s edge platforms—including DGX Spark, RTX PCs, and Jetson devices—enabling AI capabilities across a wide range of hardware. Developers can experiment with the compact Mistral 3 suite via open-source frameworks like llama.cpp and Ollama, opening new possibilities for efficient AI deployment at the edge.
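A local experiment with Ollama takes two commands. The model tag below is a placeholder, not a confirmed registry name; check the Ollama model library for the tag actually published for the Mistral 3 family.

```shell
# Pull and run a compact Mistral 3 model locally with Ollama.
# "ministral-3" is a placeholder tag -- verify the real one with
# `ollama list` after pulling, or browse the Ollama model library.
ollama pull ministral-3
ollama run ministral-3 "Summarize mixture-of-experts routing in one sentence."

# llama.cpp equivalent, assuming a GGUF checkpoint has already been
# downloaded (file name is illustrative):
# ./llama-cli -m ministral-3.gguf -p "Hello"
```

Both tools run fully offline once the weights are downloaded, which is what makes them practical for RTX PCs and Jetson-class edge devices.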
Enterprise Customization and Ecosystem Integration
NVIDIA is integrating Mistral 3 with its open-source NeMo tools—Data Designer, Customizer, Guardrails, and NeMo Agent Toolkit—to streamline AI agent development and customization, accelerating the journey from prototype to production. NVIDIA has further optimized multiple inference frameworks, such as TensorRT-LLM, SGLang, and vLLM, to maximize model performance and efficiency across cloud and edge environments. The Mistral 3 family is also expected to be available soon as NVIDIA NIM microservices, facilitating flexible AI service orchestration.
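In practice, serving stacks like vLLM (via `vllm serve`) and NIM containers expose an OpenAI-compatible HTTP API, so applications talk to all of them the same way. The sketch below builds a chat-completion request payload with only the standard library; the model identifier and endpoint URL are placeholders, not confirmed names.

```python
# Build a request for an OpenAI-compatible endpoint, such as those exposed
# by `vllm serve` or an NVIDIA NIM container. Model id and URL below are
# placeholder assumptions, not confirmed identifiers.
import json

payload = {
    "model": "mistralai/Mistral-Large-3",  # placeholder model id
    "messages": [
        {"role": "user",
         "content": "Explain mixture-of-experts inference in two sentences."}
    ],
    "max_tokens": 128,
}
body = json.dumps(payload)

# To actually send it, POST `body` to a running server, e.g.:
#   curl http://localhost:8000/v1/chat/completions \
#     -H "Content-Type: application/json" -d @request.json
```

Keeping the client on the OpenAI-compatible schema means the same code can target a cloud vLLM deployment today and a NIM microservice later without changes beyond the base URL.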
This collaboration marks a significant milestone in democratizing frontier AI technologies, empowering enterprises to innovate with scalable, efficient, and adaptable AI models from cloud data centers to edge devices worldwide.