
  • 6thbit 54 minutes ago
    > Subsequent to this solve, we finished developing our general scaffold for testing models on FrontierMath: Open Problems. In this scaffold, several other models were able to solve the problem as well: Opus 4.6 (max), Gemini 3.1 Pro, and GPT-5.4 (xhigh).

    Interesting. What's that “scaffold”? A sort of unit-test framework for proofs?
    • inkysigma 20 minutes ago
      I think in this context, scaffolds are generally the harness that surrounds the actual model: for example, any tools, ways to lay out tasks, or auto-critiquing methods.

      I think there's quite a bit of variance in model performance depending on the scaffold, so comparisons are always a bit murky.
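      Something like this minimal Python sketch, say (all names and prompts here are illustrative assumptions, not any real harness or API):

        from typing import Callable

        def solve_with_scaffold(
            problem: str,
            call_model: Callable[[str], str],  # any text-in/text-out model client
            max_rounds: int = 3,
        ) -> str:
            # First attempt from the bare model.
            attempt = call_model(f"Solve the following problem:\n{problem}")
            for _ in range(max_rounds):
                # Auto-critique: have the model review its own attempt.
                critique = call_model(
                    f"Problem:\n{problem}\n\nAttempt:\n{attempt}\n\n"
                    "List any errors or gaps. Reply with just OK if there are none."
                )
                if critique.strip().upper() == "OK":
                    break
                # Revise, feeding the critique back in as extra context.
                attempt = call_model(
                    f"Problem:\n{problem}\n\nPrevious attempt:\n{attempt}\n\n"
                    f"Critique:\n{critique}\n\nProduce a corrected solution."
                )
            return attempt

      Swap in a different critique step, different tools, or several cooperating agents and you can get quite different results from the same underlying model.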
      • readitalready 6 minutes ago
        It usually involves a lot of agents, each with their own custom contexts or system prompts.
  • karmasimida 19 minutes ago
    There's no denying at this point that AI can produce something novel, and models will be doing more of this going forward.
    • leptons 14 minutes ago
      [flagged]
      • snypher 4 minutes ago
        Your analogy falls apart if we consider the number wasn't on the clock face.
  • osti 16 minutes ago
    Seems like the high-compute parallel-thinking models weren't even needed; both the normal 5.4 and Gemini 3.1 Pro solved it. Somehow Gemini 3 Deep Think couldn't solve it.
  • renewiltord 16 minutes ago
    Fantastic news! That means with the right support tooling existing models are already capable of solving novel mathematics. There’s probably a lot of good mathematics out there we are going to make progress on.