concept
Autoresearch
created 2026-04-19 ai · research · automation · ml
An open-source project by Andrej Karpathy that turns an AI coding agent into a fully autonomous ML researcher.
How It Works
Point the agent at a small LLM training setup, go to sleep, wake up to ~100 experiments and (hopefully) a better model.
The Loop
- Agent reads all files for context, runs baseline training
- Proposes an idea, edits train.py, commits
- Trains for 5 minutes, reads the metric (val_bpb)
- If improved → keep. If not → git reset back
- Repeat indefinitely, with no human in the loop
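The keep-or-revert logic above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the function name research_loop, the LoopResult type, and the injected callables (propose, run_experiment, keep, revert) are all hypothetical, and it assumes val_bpb is a lower-is-better metric (validation bits per byte). The real loop runs indefinitely; the sketch takes a max_experiments cap so that it terminates.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class LoopResult:
    """Best metric seen so far plus a log of (idea, val_bpb, kept?) entries."""
    best_bpb: float
    history: List[Tuple[str, float, bool]] = field(default_factory=list)


def research_loop(
    baseline_bpb: float,
    propose: Callable[[], str],              # hypothetical: agent proposes an idea and edits train.py
    run_experiment: Callable[[str], float],  # hypothetical: trains for the fixed budget, returns val_bpb
    keep: Callable[[str], None],             # hypothetical: leave the commit in place
    revert: Callable[[str], None],           # hypothetical: e.g. git reset to the previous commit
    max_experiments: int,
) -> LoopResult:
    result = LoopResult(best_bpb=baseline_bpb)
    for _ in range(max_experiments):
        idea = propose()
        val_bpb = run_experiment(idea)
        # Assumption: lower val_bpb means a better model.
        improved = val_bpb < result.best_bpb
        if improved:
            result.best_bpb = val_bpb
            keep(idea)
        else:
            revert(idea)
        result.history.append((idea, val_bpb, improved))
    return result


if __name__ == "__main__":
    # Simulated run with made-up scores, purely to exercise the control flow.
    fake_scores = iter([1.10, 0.95, 0.97, 0.90])
    fake_ideas = iter(["idea-a", "idea-b", "idea-c", "idea-d"])
    outcome = research_loop(
        baseline_bpb=1.00,
        propose=lambda: next(fake_ideas),
        run_experiment=lambda idea: next(fake_scores),
        keep=lambda idea: print(f"keep   {idea}"),
        revert=lambda idea: print(f"revert {idea}"),
        max_experiments=4,
    )
    print("best val_bpb:", outcome.best_bpb)
```

Passing the propose, train, and git steps in as callables keeps the decision rule testable without a GPU or a repository; in a real setup they would wrap the agent call, the training run, and the git commands.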
Three Files
- prepare.py: Read-only infrastructure (data, tokenizer, eval). The agent cannot touch this
- train.py: The single file the agent edits. Contains the full GPT model, optimizer, and training loop
- program.md: The “research program” written by the human. Instructions, constraints, methodology
Key Design Choices
- Fixed 5-minute wall-clock budget makes all experiments directly comparable
- Single GPU, single file, single metric — intentionally minimal
- Human role shifts: from writing Python to writing program.md, i.e. “programming the researcher”
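To make the fixed-budget idea concrete, here is a small sketch of a training loop that stops on wall-clock time rather than on step count, plus the arithmetic behind the "~100 experiments overnight" figure. The names train_with_budget and experiments_per_night are hypothetical, the 8-hour night is an assumption, and the estimate ignores any per-experiment overhead (agent thinking time, evaluation, git operations).

```python
import time
from typing import Callable


def train_with_budget(
    step: Callable[[], None],
    budget_seconds: float,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Run `step` repeatedly until the wall-clock budget is spent.

    Returns the number of steps completed. Every experiment gets the same
    time, so a faster or more efficient train.py simply fits in more steps.
    """
    deadline = clock() + budget_seconds
    steps = 0
    while clock() < deadline:
        step()
        steps += 1
    return steps


def experiments_per_night(hours: float, minutes_per_experiment: float) -> int:
    """Upper bound on experiments, ignoring overhead between runs."""
    return int((hours * 60) // minutes_per_experiment)


if __name__ == "__main__":
    # Assumption: an 8-hour night at 5 minutes of training per experiment.
    print(experiments_per_night(8, 5))
    # A tiny real-time demo with a 0.05-second budget and a no-op step.
    print(train_with_budget(lambda: None, 0.05) > 0)
```

Because the budget is wall-clock time, the metric rewards whatever reaches the best val_bpb in 5 minutes on that GPU, which is what makes runs comparable even when the agent changes model size or batch size.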
Relevance to Kulify
The program.md pattern is directly analogous to how we use CLAUDE.md as the schema for the Second Brain (SB): the human defines the program, and the AI executes and iterates.
This could inspire automated knowledge curation, such as an “auto-lint” that continuously improves the SB.
Related
- LLM Wiki — Karpathy’s knowledge management pattern
- LangGraph Agent Pattern — similar autonomous agent loops in our projects