Instead, we argue that we need a solution for preserving humanity and improving the future, despite not having the easy solution of allowing gradual disempowerment coupled with single-objective beneficial AI...
The first question, one that is central to some discussions of long-term AI risk, is: how can humanity stay in control after creating smarter-than-human AI?
But given the question, the answer is overdetermined. We don’t stay in control, certainly not indefinitely. If we build smarter-than-human AI, which is certainly not a good idea right now, at best we get to choose how we cede control. If nothing else, power-seeking AI will be the default, and it will be disempowering even if it is not directly an existential threat. Even if we solve the problem of treachery robustly, and build an infantilizing vision of superintelligent personal assistants, over long enough time scales it’s implausible that we build that race of more intelligent systems and yet never cede any power to it. (And if we somehow did, the implications of keeping increasingly intelligent systems in permanent bondage seem at best morally dubious.)
So, if we (implausibly) happen to be in a world of alignment-by-default, or (even more implausibly) find a solution to intent alignment and agree to create a super-nanny for humanity, what world would we want? Perhaps we use this power to collectively evolve past humanity, or perhaps the visions of pushing for transhumanism before ASI, so that someone or some group can stay in control, are realized. Either way, what then for the humans?
agreed that among all paths to good things that I see, a common thread is somehow uplifting human cognition to keep pace with advanced AI. however, I doubt that that’s even close to good enough—human cooperation is shaky and unreliable. most humans who think they’d do good things if made superintelligent are probably wrong, due to the various ways values drift when the structure of one’s cognition changes, and many humans who say they think they’d do good things are simply lying, rather than deluding themselves or overestimating their own durable goodness. it seems to me that in order to make this happen, we need to make AIs that strongly want all humans, humanity, and other emergent groups to stick around, the way a language model wants to output text.
Yes, wanting humans to stick around is probably a minimal necessity, but it’s still probably not enough—as the post explains.
And I simply don’t think you can “make humans smarter” in ways that don’t require, or at least don’t clearly risk, erasing the things that make us fundamentally human as we understand that today.
I’ll point to a similarly pessimistic but divergent view on how to manage the likely bad transition to an AI future that I co-authored recently: