3⁄9 Although o3 solved problems in all three tiers, it likely still struggles on the most formidable Tier 3 tasks—those “exceptionally hard” challenges that Tao and Gowers say can stump even top mathematicians.
I don’t think this info was about o3 (please correct me if I’m wrong). While it suggests that not all of the solved problems were from the first tier, it would be much better to know what the actual breakdown was. In particular, since the most famous quotes about FrontierMath (“extremely challenging” and “resist AIs for several years at least”) were about the top 25% hardest problems, the accuracy on that subset seems like the more important number to use when updating on those quotes. (Not to say that 25% is a small feat in any case.)
Although it’s not made explicit, we can deduce from this earlier tweet by the same person that it’s at least in part about o3:
https://x.com/ElliotGlazer/status/1870613418644025442