To start with, I agree.
I really agree: about timescales, about the risks of misalignment, about the risks of alignment. In fact I think I’ll go further and say that in a hypothetical world where an aligned AGI is controlled by a 99th percentile Awesome Human Being, it’ll still end in disaster; homo sapiens just isn’t capable of handling this kind of power.[1]
That’s why the only kind of alignment I’m interested in is the kind that ends with the AGI in control: we ‘align’ an AGI with some minimum values that anchor it in a vaguely anthropocentric meme-space (e.g. paperclips boring, unicorns exciting), ensure some kind of attachment/bond to us (e.g. the way babies/dogs get their hooks into us), and then just let it fly; GPT-Jesus take the wheel.
(So yes, the Minds from the Culture)
No other solution works. No other path has a happy ending.[2]
This is why I support alignment research: I don’t believe the odds are good, and I don’t believe the odds are good even if the technical problem gets solved, but I don’t see a future in which homo sapiens flourishes without benevolent GPT-Jesus watching over us.
Because the human alignment problem you correctly identify as the root of our wider problems isn’t going away by itself.
[1] Not a ‘power corrupts’ argument, just stating the obvious: godlike power directed by monkeylike intelligence doesn’t end well, no matter how awesome the individual monkey.
[2] Maaaaaybe genetic engineering: if we somehow figured out how to create Homo Sapiens 2.0, and they figured out 3.0, and so on.
This pathway has a greater margin for error and far fewer dead ends where we accidentally destroy everything. It can go slow; we can do it incrementally, try multiple approaches in parallel, and build in redundancy, backups, etc.
I think if we could somehow nuke AI capabilities, this path would be preferable. But as it is, AI capabilities will cross the finish line before genetics has even left the lab.
Maybe that’s the biggest difference between me and a lot of people here. You want to maximize the chance of a happy ending. I don’t think a happy ending is coming. This world is horrible and the game is rigged. Most people don’t even want the happy ending you or I would want, at least not for anybody other than themselves, their families, and maybe their nation.
I’m more concerned with making sure the worst of the possibilities never come to pass. If that’s the contribution humanity ends up making to this world, it’s a better contribution than I would have expected anyway.