I knew the author (Michael Nielsen) once but didn’t stay in touch… I had a little trouble figuring out what he actually advocates here, e.g. at the end he talks about increasing “the supply of safety”, and lists “differential technological development” (Bostrom), “d/acc” (Buterin), and “coceleration” (Nielsen) as “ongoing efforts” that share this aim, without defining any of them. But following his links, I would define those in turn as “slowing down dangerous things, and speeding up beneficial things”; “focusing on decentralization and individual defense”; and “advancing safety as well as advancing capabilities”.
In this particular essay, his position seems similar to contemporary MIRI. MIRI gave up on alignment in favor of just stopping the stampede towards AI, and here Michael is also saying that people who care about AI safety should work on topics other than alignment (e.g. “institutions, norms, laws, and education”), because (my paraphrase) alignment work is just adding fuel to the fire of advances in AI.
Well, let’s remind ourselves of the current situation. There are two AI powers in the world, America and China (and plenty of other nations who would gladly join them in that status). Both of them are hosting a capabilities race in which multiple billion-dollar companies compete to advance AI, and “making the AI too smart” is not something that either side cares about. We are in a no-brakes race towards superintelligence, and alignment research is the only organized effort aimed at making the outcome human-friendly.
I think plain speaking is important at this late stage, so let me also try to be as clear as possible about how I see our prospects.
First, the creation of superintelligence will mean that humanity is no longer in control, unless human beings are somehow embedded in it. Superintelligence may or may not coexist with us; I don’t know the odds of it emerging in a human-friendly form. But it will have the upper hand, and we will be at its mercy. If we don’t intend to just gamble on there being a positive outcome, we need alignment research. For that matter, if we really didn’t want to gamble, we wouldn’t create superintelligence until we had alignment theory perfectly worked out. But we don’t live in that timeline.
Second, although we are not giving ourselves time to solve alignment safely, a solution still has a chance of arriving in time, if rising capabilities are harnessed to do alignment research. If we had no AI, alignment theory might take 20 or 50 years to solve, but with AI, years of progress can happen in months or weeks. I don’t know the odds of alignment getting fully solved that way, but the ingredients are there for it to happen.
I feel I should say something about the prospect of a global pause or halt. I would call it unlikely but not impossible. It looks unlikely because we are already in a decentralized, no-holds-barred race towards superintelligence, the most advanced AIs are looking pretty capable (despite some gaps, e.g. 1 2), and there’s no serious counterforce on the political scene. It’s not impossible because change, even massive change, does happen in politics and geopolitics, and there’s only a finite number of contenders in the race (though that number grows every year).