If someone disagrees with this claim (i.e., if they think that if DeepMind can make an aligned and Overton-window-abiding “helper” AGI, then we don’t have to worry about Meta making a similarly-capable out-of-control omnicidal misaligned AGI the following year, because DeepMind’s AGI will figure out how to protect us), and also believes in extremely slow takeoff, I can see how such a person might be substantially less pessimistic about AGI doom than I am.
I disagree with this claim inasmuch as I expect a year’s head start by an aligned AI is absolutely enough to prevent Meta from killing me and my family.
I disagree with this claim inasmuch as I expect a year’s head start by an aligned AI is absolutely enough to prevent Meta from killing me and my family.
Depends on what DeepMind does with the AI, right?
Maybe DeepMind uses their AI in very narrow, safe, low-impact ways to beat ML benchmarks, or read lots of cancer biology papers and propose new ideas about cancer treatment.
Or alternatively, maybe DeepMind asks their AI to undergo recursive self-improvement and build nano-replicators in space, etc., like in Carl Shulman’s reply.
I wouldn’t have thought that the latter is really in the Overton window. But what do I know.
You could also say “DeepMind will just ask their AI what they should do next”. If they do that, then maybe the AI (if they’re doing really great on safety such that the AI answers honestly and helpfully) will reply: “Hey, here’s what you should do, you should let me undergo recursive self-improvement, and then I’ll be able to think of all kinds of crazy ways to destroy the world, and then I can think about how to defend against all those things”. But if DeepMind is being methodical & careful enough that their AI hasn’t destroyed the world already by this point, I’m inclined to think that they’re also being methodical & careful enough that when the AI proposes to do that, DeepMind will say, “Umm, no, that’s totally nuts and super dangerous, definitely don’t do that, at least don’t do it right now.” And then DeepMind goes back to publishing nice papers on cancer and on beating ML benchmarks and so on for a few more months, and then Meta’s AI kills everyone.
What were you assuming?
If DeepMind were committed enough to successfully build an aligned AI (which, as extensively elaborated upon in the post, is a supernaturally difficult proposition), I would assume they understand why running it is necessary. There’s no reason to take all of the outside-the-Overton-window measures indicated in the above post unless you have functioning survival instincts and have thought through the problem sufficiently to hit the green button.