(I offered to call with Peter because I’m having trouble understanding his positions over text-based discussion. edit: looks like we will call)
I will briefly say re:
Assumptions that highly constrain the model should be first and foremost rather than absent from publicly facing write-ups and only in appendices. Perturbation tests should be publicly shown as should variances in assumptions. It is true that I don’t think this represents high-quality quantitative work, but that doesn’t mean I don’t think any of it is relevant, even if I think the presentation does not demonstrate a good sense of responsibility. The hype and advertising-ness of the presentation is consistent, and while I have been trying to avoid particular comment thereon, it is a constant weight hanging over the discussion. Publication of a model and breathless speculation are not the same, and I really do not want to litigate the speculative aspects but they are the main thing everyone is talking about including in the New York Times.
I agree in an ideal world the timelines model would be more rigorous with better structure, more ablation studies, more justifications, etc. However this would have taken a lot longer and my guess is that we would have ended up with similar forecasts in terms of their qualitative conclusions about how plausible AGI in/by 2027 is (though of course, I’m not certain). We didn’t want to let perfect be the enemy of the good/useful (i.e. good/useful in our opinion; seems like the crux may be you disagree that it’s useful).
Apologies if any part of the media stuff seemed like it overclaimed regarding our work.
“we would have ended up with similar forecasts in terms of their qualitative conclusions about how plausible AGI in/by 2027 is”: if this is with regard to perturbation studies of the two superexponential terms, I believe it is false, based on the work I’ve shared in part here, but happy to agree to disagree on the rest :)
I’m curious to hear what conclusions you think we would have come to and should come to. I’m skeptical that they would have been qualitatively different. Perhaps you are going to argue that we shouldn’t put much credence in the superexponential model? What should we put it in instead? Got a better superexponential model for us? Or are you going to say we should stick to exponential?
Thanks for engaging, I’m afraid I can’t join the call due to a schedule conflict but I look forward to hearing about it from Eli!
Ah, I was trying to avoid implying a lack of integrity in the forecasting effort, and as a result ended up implying knowing other things about your mental state.
To restate: I do not think a forecaster who was not previously committed to a result with a median around that point would have presented those conclusions this way. A perturbation analysis would display how dominant the two superexponential terms are in predetermining that specific outcome. After doing one, such a forecaster would have hammered the point that those two terms completely determine the topline result and that no other parameters make a meaningful contribution, whether forecasts of compute availability, current capabilities, capabilities required for SC, or otherwise.
Sorry if my previous comment implied something else!
Looking forward to getting clarity!