Most of your argument is about selection pressure, right? And, like, computational efficiency. You don’t actually establish any reason that AIs (or humans) will take the artifact-nature of their values to be reason to reject them. Your supported claims are that values would be rejected if they are not robust to ontology shifts, or if they are hard to optimize for, and are selected against if they don’t result in self-replication or influence-seeking. Nothing in there is about AIs rejecting values for their artifact-nature. But you include this line anyway. I’m just pointing out that EY will instantly recognize it as something he’s addressed many times before, and you haven’t actually provided any reason to think that reasoners will reject values simply because those values incidentally arose from some optimization process.
EDIT: Disagree voters should feel free to reply with quotes from the post where such a force on values is argued for.
I am puzzled that you are stating the position I spent an essay attacking as if it were a gotcha.