As for AI progress being slow, I think that without theoretical breakthroughs like neuralese, AI progress might come to a stop or stall at building more and more expensive models. Indeed, the two ARC-AGI benchmarks[1] could have demonstrated a pattern where maximal capabilities scale[2] linearly or multilinearly with ln(cost/task).
If this effect persists deep into the future of transformer LLMs, then most AI companies could run into the limits of the paradigm well before researching the next one and thereby losing any benefits of having a concise CoT.
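To make the ln(cost/task) pattern concrete, here is a minimal sketch in Python of what such a fit would look like; the (cost, score) points are invented purely for illustration and are not actual ARC-AGI leaderboard numbers:

```python
# Sketch of "capabilities scale linearly with ln(cost/task)", using made-up
# (cost, score) points rather than real ARC-AGI leaderboard data.
import numpy as np

# Hypothetical frontier points: (cost per task in USD, benchmark score in %)
costs = np.array([0.01, 0.1, 1.0, 10.0, 100.0])
scores = np.array([15.0, 30.0, 45.0, 60.0, 75.0])

# Fit score ≈ a * ln(cost) + b
a, b = np.polyfit(np.log(costs), scores, deg=1)
print(f"score ≈ {a:.1f} * ln(cost/task) + {b:.1f}")

# Under such a fit, raising the score by one point requires multiplying the
# per-task cost by a constant factor of e^(1/a).
print(f"each extra point costs roughly x{np.exp(1 / a):.2f} more per task")
```

If real frontier scores did follow such a fit, every additional point of capability would require multiplying per-task spend by a constant factor, i.e. the cost of further progress would grow exponentially.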
The second benchmark demonstrates a similar effect at high costs, but there is no straight line in the low-cost regime.
Unlike GPT-5-mini, the maximal capabilities of o4-mini, o3, GPT-5, and Claude Sonnet 4.5 on the ARC-AGI-1 benchmark scale more steeply and intersect the frontier at GPT-5 (high).
This would be great news if true!