Overview
Andrej Karpathy demonstrates that the cost to train GPT-2-level models has dropped dramatically, with a roughly 600X cost reduction achieved over seven years of hardware and software improvements.
The Breakdown
- OpenAI’s original GPT-2 training run in 2019 cost $43,000, using 32 TPU v3 chips for 7 days, and reached a CORE benchmark score of 0.256525
- Karpathy’s optimized nanochat implementation now surpasses that score in 3 hours for $73 on a single 8×H100 node
- The improvement represents a 600X cost reduction, with training costs falling approximately 2.5X every year since 2019 (a quick arithmetic check follows this list)
- Performance is measured using the CORE score metric, an ensemble evaluation across 22 different AI benchmarks including ARC and MMLU; a sketch of how such an aggregate score can be computed appears below
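
As a quick sanity check, the headline figures quoted above are internally consistent: the $43,000 and $73 costs imply close to a 600X reduction, and a ~2.5X annual decline compounded over the roughly seven years since 2019 lands in the same range. Only numbers already given in the bullets are used here.

```python
# Sanity check of the figures quoted above.
gpt2_2019_cost = 43_000      # USD, OpenAI's original GPT-2 run (2019)
nanochat_cost = 73           # USD, Karpathy's single-node 8xH100 nanochat run

ratio = gpt2_2019_cost / nanochat_cost
print(f"Cost reduction: {ratio:.0f}x")   # ~589x, i.e. roughly 600X

# A ~2.5X/year decline compounded over ~7 years gives a similar factor:
print(f"2.5^7 = {2.5 ** 7:.0f}")         # ~610
```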
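The exact CORE formula is not spelled out here. A common way to aggregate an ensemble of benchmarks is to normalize each task's accuracy against its random-guess baseline and average the results; the sketch below assumes that formulation, and the task names, accuracies, and baselines are invented purely for illustration.

```python
# Hypothetical illustration of an ensemble score aggregated across benchmarks.
# The tasks, accuracies, and baselines below are made up, and the real CORE
# metric's normalization and task list may differ.
results = {
    # task: (model accuracy, random-guess baseline)
    "arc_easy": (0.48, 0.25),
    "mmlu": (0.27, 0.25),
    "hellaswag": (0.39, 0.25),
}

def centered_accuracy(acc: float, baseline: float) -> float:
    """Rescale accuracy so random guessing maps to 0 and a perfect score to 1."""
    return (acc - baseline) / (1.0 - baseline)

# Average the normalized per-task scores into a single aggregate number.
aggregate = sum(
    centered_accuracy(acc, base) for acc, base in results.values()
) / len(results)

print(f"Aggregate score: {aggregate:.6f}")
```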