How long until the sun starts to get eaten? 10th/50th/90th percentile: 3y, 12y, 37y.
How long until an AI reaches Elo 4000 on Codeforces? 10/50/90: 9mo, 2.5y, 11.5y.
I updated my timelines
Seven months ago, I wrote down the AI predictions quoted above.
About one month ago, i.e. six months after I wrote this, an OpenAI model won the ICPC world finals, which I'd guess is roughly equivalent to Elo 4000 on Codeforces, given that it won by a significant margin.
(This updates me toward thinking both that (1) AI capabilities are increasing faster than I expected, and (2) competitive programming requires less general intelligence than I expected.)
Absent any coordinated slowdown, my new 10/50/90 guess for Dyson-sphere-level capability is: 1y, 3.3y, 18y.
(I still find it hard to predict whether progress will continue continuously or whether there will be at least one discontinuous capability leap.)
Merely surpassing the limits of human capability at something is no update at all at this point, because AlphaZero already did that (though frontier LLMs use much more compute). Programming seems like less of an update than the natural-language-proof IMO results, because for programming you can get away with straightforwardly verifiable rewards, which can't be manually formulated for many crucial real-world tasks. The natural-language-proof IMO, by contrast, requires valid informal proofs rather than merely correct final answers or formally verified wins, which more directly demonstrates that LLMs can be trained to operate at the limits of human capability even with a fuzzier kind of correctness feedback.
So in my view it's specifically this year's natural-language-proof IMO results (from OpenAI and GDM) that lend a lot of credence to the following recent claim by Sholto Douglas of Anthropic:
I definitely have to update here; that's just the laws of probability. Maybe you don't have to update much if you already expected superhuman competitive programming around now.
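To make the "laws of probability" point concrete, here's a minimal Bayes-update sketch. All the numbers are made up for illustration (they are not my actual credences): if a short-timelines hypothesis assigned a higher probability to superhuman competitive programming arriving by now than a long-timelines hypothesis did, then observing it forces the odds to shift toward short timelines in proportion to the likelihood ratio.

```python
# Illustrative Bayes update: observing an event shifts the odds between
# hypotheses in proportion to the likelihood each assigned to that event.
# All numbers below are hypothetical, chosen only to illustrate the mechanics.

def posterior(prior_short: float, p_obs_given_short: float,
              p_obs_given_long: float) -> float:
    """Posterior probability of the short-timelines hypothesis after
    observing the event (e.g. an AI winning the ICPC world finals)."""
    prior_long = 1.0 - prior_short
    joint_short = prior_short * p_obs_given_short
    joint_long = prior_long * p_obs_given_long
    return joint_short / (joint_short + joint_long)

# Suppose short timelines assigned 60% to the observation, long timelines
# assigned 10%, and we started at 50/50:
p = posterior(0.5, 0.6, 0.1)
print(round(p, 3))  # 0.857
```

If you already expected the result (p_obs_given_long close to p_obs_given_short), the posterior barely moves, which is exactly the "maybe you don't have to update much" case above.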
But this also isn't the only update that informs my new timelines. My point was more "look, I wrote down predictions in advance and they were actually useful to me", rather than an attempt to give an epistemically legible account of my timeline models.