Very probably not. I’m claiming that the desire to code it in would be convergent, ’cuz it’s the best way to do AI even if you think you’re just trying to maximize paperclips. Of course, most AGI researchers aren’t that clever, so again, we still need to raise awareness about AGI dangers. I’m just floating a contrarian hypothesis that seems somewhat neglected.
But it’s a lot harder to code that than to code “maximize paperclips”.
Then you should have said that!
That sounds exactly like CEV.
I think a closer match is the “shaper-anchor semantics” from Eliezer’s “Creating Friendly AI”.