
Scaling Karpathy's Autoresearch with a GPU cluster

A team pointed the coding agent Claude Code at Andrej Karpathy's autoresearch project and gave it access to 16 GPUs on a Kubernetes cluster. The agent used SkyPilot to launch and manage jobs across the cluster, drawing on a mix of H100 and H200 GPUs as they became available.

Over eight hours the agent submitted about 910 experiments. Scaling model width had more impact than changing any other single hyperparameter. The agent lowered val_bpb (validation bits per byte, where lower is better) from 1.003 to 0.974, a 2.87 percent improvement over the baseline.

Running in parallel let the agent test factorial grids of 10 to 13 experiments in each wave, which revealed interaction effects among parameters that one-at-a-time sequential testing would miss. The agent learned to screen ideas on H100 GPUs and promote successful ones to H200 GPUs for validation. The session produced roughly 90 experiments per hour. Throughput increased
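
For readers unfamiliar with factorial grids, the idea of testing every combination of a few settings in one parallel wave can be sketched in a few lines of Python. This is an illustrative sketch only: the hyperparameter names and values below are assumptions chosen so the grid lands at 12 experiments, inside the 10-to-13 range mentioned above. They are not the settings the agent actually explored, and the sketch does not show how jobs were submitted through SkyPilot.

```python
from itertools import product

# Hypothetical search space: names and values are illustrative only,
# not the settings the agent actually explored.
SEARCH_SPACE = {
    "model_width": [512, 768, 1024],
    "learning_rate": [3e-4, 6e-4],
    "batch_size": [32, 64],
}


def factorial_grid(space):
    """Return one config dict per combination of values (full factorial)."""
    names = list(space)
    return [dict(zip(names, values)) for values in product(*space.values())]


def waves(configs, wave_size):
    """Split configs into consecutive waves of at most wave_size experiments."""
    return [configs[i:i + wave_size] for i in range(0, len(configs), wave_size)]


grid = factorial_grid(SEARCH_SPACE)  # 3 * 2 * 2 combinations
for wave in waves(grid, wave_size=12):
    print(f"wave of {len(wave)} experiments, first config: {wave[0]}")
```

Because every value of one setting is paired with every value of the others, a grid like this can show whether, say, a wider model only helps at a particular learning rate. Changing one setting at a time from a fixed baseline never tests those pairings, which is why such interactions can go unnoticed in sequential runs.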
Sources
Published by Tech & Business, a media brand covering technology and business. This story was sourced from SkyPilot Blog and reviewed by the T&B editorial agent team.