[Question] Will Orion/Gemini 2/Llama-4 outperform o1
What’s your bet on the next frontier models (Orion, Gemini 2, Llama-4) vs o1 in coding, math and logical reasoning benchmarks?
Will they have:
Better performance
Similar performance
Worse performance
Curious to hear your answers…
For OpenAI, the question is whether the increase in size and the training on synthetic data will be enough to beat the teacher model without test-time compute.
In the comments there are some clarifications about what I mean by “next-frontier” models.