GPT-5.4 Pro Sets New FrontierMath Record, Solves a Tier 4 Problem No Prior AI Had Cracked

OpenAI's GPT-5.4 Pro set a new high-water mark on FrontierMath, Epoch AI's benchmark of research-grade mathematics, scoring 50% on Tiers 1 through 3 and 38% on Tier 4. Among its Tier 4 solutions was one problem that no prior AI system had cracked, although Epoch's analysis found that the model likely located a 2011 preprint that let it shortcut significant portions of the intended work. On the harder FrontierMath: Open Problems benchmark, GPT-5.4 Pro did not solve any problems. Epoch noted that the model made novel observations on one of those problems but characterized them as relatively uninteresting. When FrontierMath launched in late 2024, leading AI models solved under 2% of its problems. The jump to 50% represents the largest single-model improvement since the benchmark launched.