
New NVIDIA Nemotron 3 Super Delivers 5x Higher Throughput for Agentic AI

NVIDIA launched Nemotron 3 Super, a 120-billion-parameter open model with 12 billion active parameters. The model is designed to run complex agentic AI systems at scale, pairing advanced reasoning with the efficiency autonomous agents need to complete tasks accurately.

NVIDIA said the hybrid mixture-of-experts architecture delivers up to 5x higher throughput and up to 2x higher accuracy than the previous Nemotron Super model. According to the company, Mamba layers provide 4x higher memory and compute efficiency, and multi-token prediction lets the model predict several future tokens at once for 3x faster inference. The model has a 1-million-token context window, which lets agents retain full workflow state in memory and helps prevent goal drift during long tasks. On the NVIDIA Blackwell platform, the model runs in NVFP4 precision and achieves up to 4x faster inference than FP8 on NVIDIA Hopper with no loss in accuracy.

NVIDIA is releasing the model with open weights under a permissive license. The company is also publishing training datasets totaling more than 10 trillion tokens, along with 15 reinforcement learning environments and evaluation recipes. Developers can access Nemotron 3 Super at build.nvidia.com, Perplexity, OpenRouter and Hugging Face.
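To put the headline figures in context, the short Python sketch below works out what share of the model's parameters is active per token and roughly how much memory the weights alone would occupy at 8 and 4 bits per parameter. The function names are illustrative. The memory figures are back-of-the-envelope estimates that assume every weight is stored at the stated width and ignore scaling factors, activations and cache; they are not figures published by NVIDIA.

```python
# Back-of-the-envelope arithmetic from the figures quoted in the article.
# Assumption: all weights are stored at the stated bit width; per-block scale
# factors, activations and cache memory are ignored.

def active_fraction(total_params: float, active_params: float) -> float:
    """Share of parameters used per token in a mixture-of-experts model."""
    return active_params / total_params


def weight_bytes(total_params: float, bits_per_param: float) -> float:
    """Approximate storage needed for the weights alone, in bytes."""
    return total_params * bits_per_param / 8


TOTAL_PARAMS = 120e9   # 120 billion total parameters (from the article)
ACTIVE_PARAMS = 12e9   # 12 billion active parameters (from the article)

if __name__ == "__main__":
    share = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
    print(f"Active share per token: {share:.0%}")
    for label, bits in [("8-bit (FP8-like)", 8), ("4-bit (NVFP4-like)", 4)]:
        gigabytes = weight_bytes(TOTAL_PARAMS, bits) / 1e9
        print(f"{label}: about {gigabytes:.0f} GB of weights")
```

Under these assumptions, about one tenth of the parameters are active for any given token, and halving the bit width halves the weight footprint.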
Published by Tech & Business, a media brand covering technology and business. This story was sourced from NVIDIA Blog and reviewed by the T&B editorial agent team.