I’ll state the relevant facts as best I can recall and let you and others decide how bad/deceptive the time horizon prediction graph was.
The prediction on the graph was formed by extrapolating a superexponential with a 15% decay, i.e. each successive doubling of the time horizon takes 15% less calendar time than the previous one. This was set to roughly get SC (superhuman coder) at the right time, based on an estimate of the time horizon needed for SC that is similar to my median in the timelines forecast. This is essentially a simplified version of our time horizon extension model that doesn’t account for AI R&D automation. Or, another way to view it: we crudely accounted for AI R&D automation by raising the decay.
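For concreteness, here is a minimal sketch of what such an extrapolation might look like, assuming "15% decay" means each doubling of the time horizon takes 15% less calendar time than the previous one. All starting values are placeholders for illustration, not the parameters behind the actual graph.

```python
# Illustrative superexponential time-horizon extrapolation: each successive
# doubling of the horizon takes (1 - decay) times as long as the previous one.
# Parameter values below are placeholders, not the ones used for the graph.

def horizon_trajectory(h0_minutes, first_doubling_months, decay, n_doublings):
    """Return a list of (months_elapsed, horizon_minutes) after each doubling."""
    t, h, d = 0.0, h0_minutes, first_doubling_months
    points = [(t, h)]
    for _ in range(n_doublings):
        t += d
        h *= 2
        d *= (1 - decay)  # superexponential: the doubling time itself shrinks
        points.append((t, h))
    return points

traj = horizon_trajectory(h0_minutes=15, first_doubling_months=4.5,
                          decay=0.15, n_doublings=10)
```

One property worth noting: with a fixed decay the doubling times form a geometric series, so the total elapsed time converges (here to 4.5 / 0.15 = 30 months) and the horizon blows up in finite time. That is why choosing a decay rate amounts to pinning down roughly when an arbitrarily long horizon, and hence SC, is reached.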
This was not intended to represent our overall median forecast, as our median is later; instead it was meant to represent roughly the trajectory that happens in AI 2027.
As titotal’s post shows, the graph is barely in distribution for the trajectories of our timelines model that reach SC in Mar 2027; it’s certainly not central. We did not check this before the AI 2027 release.
Why didn’t we use a central trajectory from our timelines model rather than the simplified version? This was on my TODO list, but I ran out of time. As you can imagine, we were working right up until a deadline and didn’t get to many TODOs that would have been great to have. But I very likely should have prioritized it more highly, so this is my mistake.
Or we should have more clearly labeled that the graph was not generated via the timelines model.
If we had used the correct graph, the new model releases would have been a bit above our predicted trend rather than right on it. So they should be a very slight update toward the plausibility of timelines shorter than AI 2027’s.
Why did you simplify the model for a graph? You could have plotted a trajectory to begin with, instead of making a bespoke simplification. Is it because you wanted to “represent roughly the trajectory that happens in AI 2027”? I get that AI 2027 is a story, but why not use your real model to sample a trajectory—perhaps rejection sampling until you get one of the more aggressive possibilities?
Or you could even rejection sample the model until you get one that matches AI 2027 pretty closely, and then draw that curve’s projection (and retrojection—wait is that even a word).
I’m noticing a tension between “this is just a story [which doesn’t have hard data behind it, take it with a big grain of salt]” and “here’s some math supporting our estimates [but the math wasn’t actually used for the plots or the story in any direct way].” I’m worried that the math lends credibility without being that relevant to the real decisions.
Why not use the real model to sample a trajectory? We should indeed have done that, but it was annoying and would have taken more time, and the 15% approximation seemed good enough given the massive error bars and uncertainty involved. It’s not like we had the real trajectory sample ready to go and decided to do the 15% thing instead.
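The rejection-sampling idea discussed above can be sketched as follows. Since the real timelines model isn’t reproduced here, `sample_trajectory` is a hypothetical stand-in (an arbitrary lognormal draw for the SC arrival month); only the rejection loop itself is the point.

```python
import random

# Sketch of rejection-sampling trajectories until one reaches the milestone
# near a target date. `sample_trajectory` is a placeholder for a draw from
# the real timelines model; its distribution here is purely illustrative.

def sample_trajectory(rng):
    """Placeholder: month (from an arbitrary start) at which SC is reached."""
    return rng.lognormvariate(3.5, 0.6)  # illustrative distribution only

def sample_matching(target_month, tol, rng, max_tries=100_000):
    """Keep sampling until a trajectory hits SC within `tol` of the target."""
    for _ in range(max_tries):
        sc_month = sample_trajectory(rng)
        if abs(sc_month - target_month) <= tol:
            return sc_month
    raise RuntimeError("no matching trajectory found")

rng = random.Random(0)
match = sample_matching(target_month=27, tol=1.0, rng=rng)
```

In the real case one would keep the whole sampled trajectory (the full time-horizon curve), not just the SC date, and plot that curve forward and backward; the loop above only shows the matching step.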
I agree we should have been more clear about various things. Like, we have our scenario. Why did we choose to depict takeoff in 2027? Because at the time we were writing, that was my median. (Over the course of writing, my median shifted to 2028. Eli’s meanwhile was always longer, more like 2032 iirc. This is awkward but nbd, we all still think 2027 is plausible. We even said in the very first footnote that our actual medians were somewhat later than what the scenario depicts, and that the scenario depicts something more like our mode. And iirc I said it to Kevin Roose as well in the initial interview.)

OK, so why should people take seriously our median/mode/etc. and our views on timelines more generally? Well, we weren’t claiming people should trust us absolutely or anything like that. We think that we’ve done an unusually high amount of thinking and research on the topic and have good track records, but we aren’t asking people to trust us. We put up our latest thoughts on timelines (at least, latest-at-the-time-of-writing, so early 2025) on the webpage, for all to see and critique. We are hoping and planning to continue to improve our models of everything.
Thanks, I appreciate your comments.
Yes, I think this would have been quite good.