Thanks for taking the time to write up your reflections. I agree that the before/after distinction seems especially important (‘only one shot to get it right’), and a crux of the EY/NS worldview that I expect many non-readers not to know about.
I’m wondering about your take in this passage:
> In the book they make an analogy to a ladder where every time you climb it you get more rewards but once you reach the top rung then the ladder explodes and kills everyone. However, our experience so far with AI does not suggest that this is a correct world view.
I’m curious: what about the world’s experience with AI seems, from your POV, to falsify it (or cast doubt upon it)? Is it about believing that systems have become safer and more controlled over time?
(Nit, but the book doesn’t posit that the explosion happens at the top rung; in that case, we could just avoid ever reaching the top rung. It posits that the explosion happens at a not-yet-known rung, and so each successive rung climb carries some risk of blow-up. I don’t expect this distinction is load-bearing for you though)
(Edit: my nit is wrong as written! Thanks Boaz—he’s right that the book’s argument is actually about the top of the ladder, I was mistaken—though with the distinction I was trying to point at, of not knowing where the top is, so from a climber’s perspective there’s no way of just avoiding that particular rung)
p.s. I just realized that I did not answer your question:
> Is it about believing that systems have become safer and more controlled over time?
No, this is not my issue here. While I hope it won’t be the case, systems could well become more risky and less controlled over time. I just believe that if that is the case, then it would be observable via an increased rate of safety failures far before we reach the point where failure means that literally everyone on earth dies.
What’s the least-worrying thing we may see that you’d expect to lead to a pause in development?
(this isn’t a trick question; I just really don’t know what kind of thing gradualists would consider cause for concern, and I don’t find official voluntary policies to be much comfort, since they can just be changed if they’re too inconvenient. I’m asking for a prediction, not any kind of commitment!)
See my response to Eliezer. I don’t think it’s one shot; I think there are going to be both successes and failures along the way that will give us information we can use.
Even self-improvement is not a singular event: AI scientists are already using tools such as Codex or Claude Code to improve their own productivity. As models grow in capability, the benefit of such tools will grow, but it is not necessarily one event. Also, I think we would likely require this improvement just to sustain the exponential at its current rate; it would not be sustainable to continue the growth in hiring, so increasing productivity via AI would be necessary.
Re the nit: on page 205 they say “Imagine that every competing AI company is climbing a ladder in the dark. At every rung but the top one, they get five times as much money … But if anyone reaches the top rung, the ladder explodes and kills everyone. Also nobody knows where the ladder ends.”
I’ll edit the text a bit so it’s clear that you don’t know where the ladder ends.
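To make the quoted setup concrete, here is a minimal simulation sketch of the incentive structure it describes. The five-times-per-rung payoff is from the book’s analogy; the uniform distribution over the unknown top rung, and all the other numbers, are assumptions I’m making purely for illustration:

```python
import random

# A minimal sketch of the book's ladder analogy (p. 205). The 5x payoff
# per rung is from the quote; the distribution of the unknown top rung
# (uniform on 1..20) and the trial count are illustrative assumptions.

def play_ladder(rungs_to_climb, max_top=20, trials=100_000):
    """Climb a fixed number of rungs; the fatal top rung is unknown.

    Returns (expected payoff, probability of triggering the explosion).
    Payoff is 5**rungs_to_climb if the top rung is never reached, else 0.
    """
    total_payoff = 0
    explosions = 0
    for _ in range(trials):
        top = random.randint(1, max_top)  # assumed: top rung uniform on 1..max_top
        if rungs_to_climb >= top:
            explosions += 1               # reached the top: ladder explodes
        else:
            total_payoff += 5 ** rungs_to_climb
    return total_payoff / trials, explosions / trials

for n in (1, 5, 10, 15):
    ev, p_boom = play_ladder(n)
    print(f"climb {n:2d} rungs: expected payoff {ev:14.0f}, P(explosion) {p_boom:.2f}")
```

Under these assumed numbers, the expected payoff keeps growing with each additional rung even as the probability of triggering the explosion rises, which is exactly the incentive problem the analogy points at when nobody knows where the ladder ends.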
(Appreciate the correction re my nit, edited mine as well)