Should we postpone AGI until we reach safety?

Should we postpone AGI until its risks have fallen below a certain level, thereby applying the precautionary principle? And if so, would setting up policy be a promising way to achieve this?

As I argued here in the comments, I think calling for precautionary-principle policy, notably directed at political decision makers, would be a good idea. I've had a great LW and telephone discussion about this with Daniel Kokotajlo, who disagrees for the reasons below. I think it is valuable to make our lines of reasoning explicit and sharpen them through discussion, which is why I'm promoting them to a post.

Assuming AGI arrives within a relevant time span, there are two ways in which humanity can decrease x-risk to acceptable levels: 1) AI alignment, consisting of both technical AI safety and reasonable values alignment, and 2) AGI postponement until 1 can be achieved with sufficient existential safety (which may be anywhere between soon and never).

Since it is uncertain whether we can achieve 1 at all, and also whether we can achieve it in time assuming we can achieve it in principle, we should aim to achieve 2 as well. The main reason is that postponement, if successful, could lead to a significant reduction of total existential risk, and since we don't know much about how hard it is, it could well be worthwhile. Neither companies nor academics are incentivized to postpone AGI development in accordance with the precautionary principle. Therefore, I think we need a different body calling for this, and states seem a sensible choice. As a positive side effect, the pressure of possible postponement would incentivize companies and academia to invest significantly more in alignment, thereby further reducing existential risk.

Daniel disagreed with me mainly for three reasons. I obtained two more counterarguments from the discussions below and elsewhere. The following appears, so far, to be a reasonably complete list of counterarguments.

  1. Calling for AGI postponement until safety has been achieved would, he thinks, alienate the AI community from the AI safety community, and that would hurt the chances of achieving AI safety.

  2. It would be rather difficult (or extremely difficult, or impossible) to get useful regulations passed, because influencing governments is generally hard and influencing them in the right ways is even harder.

  3. Restricting AGI research while allowing computer hardware progress, other AI progress, etc. to continue would make the eventual takeoff faster, by increasing hardware overhang and other kinds of overhang. A faster takeoff is probably more dangerous.

  4. Efficiency: we have a limited amount of labor available for reducing existential risk. Working on AI safety is more efficient than working on AGI postponement, so we should focus on AI safety and discard AGI postponement.

  5. If we manage to delay only the most safety-abiding parties but not the others, we could hurt safety instead of helping it.

On the first argument, I replied that I think a group unconnected to AGI safety could do this, and therefore not hurt the principally unrelated AGI safety efforts. Such a group could even call for reduction of existential risk in general, further decoupling the two efforts. Also, even if there were a small adverse effect, I think it would be outweighed by the positive effect of incentivizing corporations and academia to fund more AI safety (since this would now also be stimulated by regulation). Daniel said that if this turned out to be true, which we could establish for example by researching respondents' behaviour, it could change his mind.

I agree with the second counterargument, but if the gain conditional on success is large (which I think it is), and the required effort is uncertain (which I think it also is), then exploring the option makes sense.

The third counterargument would be less relevant if we are already heading for a fast takeoff (which I really don't know). I think this argument requires more thought.

On the fourth counterargument, I would say that the vast majority of people are unable to contribute meaningfully to AI safety research. Of course, all these people could in theory earn as much money as possible and donate it to AI safety research, but most will not do that in practice. I think many of these people could contribute to the much more generic task of arguing for postponement.

Finally, regarding the fifth counterargument, I would say we should argue for a global postponement treaty, or perhaps a treaty among the most likely parties (e.g. nation states). If all parties are affected equally, this argument loses its weight.

I’m curious about others’ opinions on the matter. Do you also think postponing AGI until we reach safety would be a good idea? How could this be achieved? If you disagree, could you explicitly point out which parts of the reasoning you agree with (if any), and where your opinion differs?