NVIDIA AI-Q deep research agent reaches #1 on DeepResearch Bench I and II using Nemotron models

Saturday, June 27, 2026 · 12:04 AM UTC

The NVIDIA AI-Q deep research agent took first place on DeepResearch Bench with a score of 55.95 and also led DeepResearch Bench II with a score of 54.50. Both benchmarks assess research agents on report quality, information recall, analysis and presentation.

The agent uses a multi-agent architecture with planner, researcher and orchestrator components. It runs on the NVIDIA NeMo Agent Toolkit and fine-tuned NVIDIA Nemotron 3 Super models. Specialist subagents handle evidence gathering, causal exploration, benchmarking, critique and trend scanning in parallel.

Training used about 67,000 trajectories drawn from open datasets such as OpenScholar, ResearchQA and Fathom-DeepResearch-SFT. An NVIDIA Nemotron-3-Super-120B-A12B model received supervised fine-tuning for one epoch across 16 [...]

An optional ensemble merges outputs from parallel agents. A report refiner step can draw on raw researcher briefs to improve the final document.
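The fan-out pattern described above, where an orchestrator runs specialist subagents in parallel and a refiner assembles their briefs, can be illustrated with a minimal Python sketch. This is not NVIDIA's implementation and does not use the NeMo Agent Toolkit API. Only the five specialist roles come from the article; the function names `plan`, `run_specialist`, `refine` and `orchestrate` are hypothetical, and the subagent body is a placeholder where a model or search call would go.

```python
import asyncio

# The five specialist roles are taken from the article.
# Every function below is a hypothetical stand-in, not NVIDIA's code.
SPECIALISTS = [
    "evidence gathering",
    "causal exploration",
    "benchmarking",
    "critique",
    "trend scanning",
]


def plan(question: str) -> list[str]:
    """Hypothetical planner: assign one task per specialist role."""
    return list(SPECIALISTS)


async def run_specialist(role: str, question: str) -> str:
    """Placeholder subagent; a real one would call a model or a search tool."""
    await asyncio.sleep(0)  # yields control, standing in for I/O-bound work
    return f"[{role}] notes on: {question}"


def refine(question: str, briefs: list[str]) -> str:
    """Hypothetical refiner: build the final report from the raw briefs."""
    return f"Report: {question}\n" + "\n".join(briefs)


async def orchestrate(question: str) -> str:
    """Run all specialists concurrently, then hand their briefs to the refiner."""
    roles = plan(question)
    # asyncio.gather returns results in the same order as its arguments.
    briefs = await asyncio.gather(
        *(run_specialist(role, question) for role in roles)
    )
    return refine(question, list(briefs))
```

Calling `asyncio.run(orchestrate("some research question"))` returns a single report string with one brief per specialist. The optional ensemble step mentioned in the article is not modelled here.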

Published by Tech & Business, a media brand covering technology and business. This story was sourced from NVIDIA and reviewed by the T&B editorial agent team.
