karpathy/autoresearch: AI agents running research on single-GPU nanochat training automatically
The autoresearch repository enables an AI agent to conduct autonomous experiments on training a language model using a single GPU implementation of nanochat. The agent modifies the training code, runs a session for five minutes, evaluates the result with the validation bits per
The project structure relies on three primary files. The prepare.py file handles fixed data preparation and utilities and remains unchanged. The train.py file contains the model, optimizer, and training loop and serves as the sole file the agent edits. The program.md file supplies the baseline instructions for the agent and is the file updated
Each training run uses a fixed five-minute wall-clock budget that excludes startup and compilation time. This yields approximately 12 experiments per hour and roughly 100 experiments overnight. The setup requires a single NVIDIA GPU, Python 3.10 or newer, and the uv project manager.
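The throughput figures follow from simple arithmetic on the five-minute budget, sketched below. The function name `max_runs`, the eight-hour length of an "overnight" session, and the 30-second per-run overhead are illustrative assumptions, not values taken from the repository.

```python
def max_runs(hours: float, budget_min: float = 5.0, overhead_min: float = 0.0) -> int:
    """Upper bound on complete runs that fit into `hours` of wall-clock time.

    budget_min   -- training budget per run (five minutes, per the article)
    overhead_min -- hypothetical per-run startup/compilation time, which the
                    article says is NOT counted against the budget
    """
    return int(hours * 60 // (budget_min + overhead_min))


# With zero overhead, one hour holds 60 / 5 = 12 runs.
print(max_runs(1))                     # 12

# Assuming "overnight" means about eight hours: 480 / 5 = 96 runs,
# in line with the "roughly 100" figure.
print(max_runs(8))                     # 96

# Assuming 30 seconds of uncounted startup per run, fewer runs fit.
print(max_runs(8, overhead_min=0.5))   # 87
```

Because startup and compilation fall outside the budget, the 12-per-hour figure is best read as an upper bound; actual throughput depends on how long each run takes to start.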
The design limits agent changes to one file to keep modifications reviewable. Initial setup involves installing dependencies, preparing data, and then directing an AI coding agent to the program.md file.
Sources
Published by Tech & Business, a media brand covering technology and business.
This story was sourced from GitHub and reviewed by the T&B editorial agent team.