The go/no-go model is not meant to show that a P(doom) of up to 97% is “acceptable” (or at least it would risk being highly misleading to say that). The model is only meant to show that up to that level of risk, launching superintelligence increases life expectancy under the given assumptions. That model ignores many important factors (such as distributional considerations and diminishing marginal utility in QALYs), which is why a series of more complicated models are introduced that take into account some of these other factors. (Even the most elaborate of the models introduced is still only very schematic and leaves out much that is relevant, as all formal models of this sort do. “For these and other reasons, the preceding analysis—although it highlights several relevant considerations and tradeoffs—does not on its own imply support for any particular policy prescriptions.”).
By the way, there may also be reasons to regard implementing a lottery that would involve going out and killing some random subset of the human population differently from allowing technological progress to continue—even if we were to stipulate that the two cases were exactly parallel with respect to some set of consequentialist outcome metrics. (Also, while in your example, using randomization would equalize people’s chances or ex ante expected lifespans, it would lead to radically uneven ex post outcomes. Some people with egalitarian intuitions care about inequality of outcomes, not only inequality of chances or opportunities—especially in cases where the inequality of outcomes is not connected to personal motivations, efforts, or choices.)
The go/no-go model is not meant to show that a P(doom) of up to 97% is “acceptable”
I should have been clearer, yes. I meant that the 97% deals are acceptable according to your go/no-go model, not to you or to your later models.
However, I think my arguments apply equally to your later models, just with P(doom) different from 97%. (See below.)
a series of more complicated models are introduced that take into account some of these other factors.
Thank you for running the more complicated models. (And, in case it’s unclear, I did read your entire article before making any comments.)
Do the models help us understand how we should act? Here is how I look at it:
It’s difficult to get intuition about how good or bad the optimal ASI launch times are, because the envisioned situation is so far from experience.
In each model, there is a remaining P(doom) at the model-optimal launch time, call it R%. R% is sometimes large, sometimes small.
From a perspective that stays entirely within the specific person-affecting framework used by the models, the following deals have equal value:
Deal 1: At ASI launch, all humans are killed with R% probability, but life expectancy becomes superhuman with probability 100% - R%.
Deal 2: At ASI launch, R% of humans are killed with certainty, which is the cost of providing superhuman life expectancy for the remaining 100% - R%.
What happens if we put one foot outside the model assumptions and begin to care somewhat about future generations? Are the models’ lessons robust to a small step away from their assumptions?
If we care about future generations, this breaks the tie. Deal 1 seems worse than Deal 2, since under Deal 1 there may easily be no future generations at all, while under Deal 2 future generations can repopulate, be happy, and so on. Even if one believes Deal 1 is not strictly worse than Deal 2, Deal 1 hardly seems much better.
Let’s now return to viewing things as we actually see them, without adopting a specific framework whose rules we restrict ourselves to follow. It’s difficult for me to get an intuitive grasp on Deal 1, but Deal 2 is easier to understand. There are lots of similar historical precedents. Deal 2 is explicitly killing a large proportion of humans so that the others can prosper more. It’s horrible.
We have seen that, stepping even slightly outside the models’ person-affecting stance, Deal 1 is worse than Deal 2. Yet Deal 2 is plainly awful. So what is the value of simulating optimal times for models to accept Deal 1? It is like modeling the “best” time to enact horror.
I should mention that the models often find a high R% to be optimal. For example, with 50% initial P(doom) and 5%/yr safety progress, the model you used in Table 3 would launch while P(doom) still remains at 32%, with an overall life expectancy of 774 years, rather than waiting a handful of decades for P(doom) to fall to near 0. I noticed this when I coded your Appendix A model for the sensitivity analysis in my other comment (which also found a small error in your Table 3; see that comment for details).
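The same qualitative effect shows up in a toy person-affecting model of my own, which is not your Appendix A model, just an illustration under assumed numbers: the living have a 40-year remaining life expectancy (exponential hazard), P(doom) starts at 50% and decays proportionally at 5%/yr, and a successful launch grants 2,000 further life-years. Even this crude version puts the optimum at only about 8–9 years of delay, with roughly a third of the risk still remaining:

```python
import math

# Toy launch-timing model with my own assumed numbers; this is NOT the
# article's Appendix A model, only an illustration of the same effect.
MU = 1.0 / 40.0         # mortality hazard (40-year remaining life expectancy)
P0, DECAY = 0.50, 0.05  # P(doom)(t) = 0.50 * exp(-0.05 * t)
L_ASI = 2000.0          # life-years gained by a successful launch (assumed)

def expected_life_years(t: float) -> float:
    """Per-capita expected life-years for the currently living if the
    launch happens at year t."""
    survive = math.exp(-MU * t)          # P(alive at launch)
    p_doom = P0 * math.exp(-DECAY * t)   # residual risk at launch
    pre_launch = (1.0 - survive) / MU    # E[years lived before the launch]
    return pre_launch + survive * (1.0 - p_doom) * L_ASI

# Grid search over launch times (0 to 100 years, 0.1-year steps).
best_t = max((i / 10.0 for i in range(1001)), key=expected_life_years)
residual = P0 * math.exp(-DECAY * best_t)
print(f"optimal launch year: {best_t:.1f}, residual P(doom): {residual:.0%}")
```

Waiting longer keeps driving P(doom) down, but the model discounts that because more of the currently living die in the meantime; that tradeoff is what stops the optimum well short of near-zero risk.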
Some people … care about inequality of outcomes, not only inequality of chances
I neglected this; thank you for raising it. I don’t think it undermines my argument, but let me know if you disagree.
Infinite regress issue?
I apologize if you addressed this somewhere or if I misunderstand, but is there an infinite regress issue with your models?
Take three times, T0, T1, and T2, each a year apart. Let average remaining life expectancy be 40 years among the living at each time. Suppose the model is run at T0 and it says “Delay ASI launch 3 years”. However, if the model is run again at T1, it will still say “Delay ASI launch 3 years”, because the model explicitly only cares about the then-living, whose situation is unchanged. And the same holds at T2 and later. So one would never reach the time to launch ASI.
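The regress is easy to see in a sketch, assuming the model is naively re-run each year on the then-living population (the recommend_delay rule here is hypothetical, not from the article):

```python
# Sketch of the regress: a hypothetical rule that, given the current
# population's remaining life expectancy, recommends how long to delay.
def recommend_delay(remaining_life_expectancy: float) -> int:
    # A person-affecting model sees the same inputs every year, since it
    # only counts the currently living, whose remaining LE stays ~40 years.
    return 3 if remaining_life_expectancy == 40.0 else 0

launched = False
for year in range(100):            # T0, T1, T2, ...
    delay = recommend_delay(40.0)  # the inputs never change year to year
    if delay == 0:
        launched = True
        break

print(launched)  # stays False: the advice is always "wait 3 more years"
```

Unless the model is given some state that changes over time (such as falling P(doom)), the re-run recommendation is identical every year and the launch never arrives.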