expect such a crisis to have at most modest effects on timelines to existentially dangerous ASI being developed
It may be my lack of economics education speaking, but how can that be the case? Are current timelines not relying heavily on the labs’ ability to raise huge amounts of capital for building huge datacenters and for paying many people who are smarter than current frontier models to manually generate large amounts of quality data? Wouldn’t such a crisis make that much harder for them, plausibly beyond what makes direct economic sense, because of what responsible investors think a responsible investor is expected to do?
Yes, that is a plausible scenario. But the project could, in theory, be sponsored directly by the government, or a Chinese project could be sponsored by the CCP. What I suspect is that creating superhuman coders or researchers is infeasible not just for economic reasons, but because of scaling laws and the quantity of training data, unless someone makes a bold move and applies some new architecture.
My other predictions of progress on benchmarks
If my suspicions are true, then the bubble will pop once it becomes clear that the METR law[1] has reverted to its original trend of the time horizon doubling every 7 months, with training compute costs growing alongside it (and do inference compute costs grow even faster?).
However, my take on scaling laws could be invalidated in a few days if it mispredicts things like the METR-measured time horizon of Claude Haiku 4.5 (which I forecast to be ~96 minutes) or the performance of Gemini 3[2] on the ARC-AGI-1 benchmark. (Since o4-mini, o3, and GPT-5 form a nearly straight line, while Claude Sonnet 4.5[3] lands on the line or a bit below it, I don’t expect Gemini 3 to land above the line.)
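For concreteness, here is a minimal Python sketch of the kind of extrapolation behind such a point forecast: take an anchor model’s measured 50%-success time horizon and project it forward under an assumed doubling time (7 months for the original METR trend). The anchor horizon and dates in the example are hypothetical placeholders, not METR’s published measurements.

```python
from datetime import date

# Minimal sketch of extrapolating a METR-style time-horizon trend.
# The anchor horizon and dates below are illustrative placeholders,
# not METR's published measurements.

def forecast_horizon(anchor_minutes: float, anchor_date: date,
                     target_date: date, doubling_months: float = 7.0) -> float:
    """Project a 50%-success time horizon forward, assuming it doubles
    every `doubling_months` months."""
    months_elapsed = (target_date - anchor_date).days / 30.44  # average month length
    return anchor_minutes * 2 ** (months_elapsed / doubling_months)

# Hypothetical example: a 60-minute horizon measured in February 2025,
# projected to mid-October 2025 under a 7-month doubling time.
print(round(forecast_horizon(60.0, date(2025, 2, 1), date(2025, 10, 15)), 1))
```

Plugging in actual anchor measurements and the assumed doubling rate reproduces the kind of point forecast quoted above, and a mismatch with the measured Haiku 4.5 horizon would be evidence against that assumed rate.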
[1] However, METR will likely have to create a new set of tasks in order to measure the horizon.
[2] Which is rumored to appear on October 22.
[3] This is also true for Claude Haiku 4.5, but I made the conjecture before learning about Haiku’s performance.