I appreciate this post (still, two years later). It draws into plain view the argument: “If extreme optimization for anything except one’s own exact values causes a very bad world, then humans other than oneself getting power should be roughly as scary as a paperclipper getting power.” I find it helpful to have this argument in plainer view, and to contemplate together whether the reply is something like:
1. Yes
2. Yes, but much less so, because value isn’t that fragile
3. No, because human values aren’t made of “take some utility function and subject it to extreme optimization,” but of something else, e.g. looking for places where many different thingies converge, as with convergent instrumental utility (my own guess is something in this vague vicinity, which also gives me somewhat more hope that I might like some things about what autonomous AIs build if they go Foom)
4. ...?