Seems to me we have a few logical jumps here.
First, yes, alleles play zero-sum games against each other. So what? At this moment, some alleles are winning and some are losing; some have a higher share of the gene pool, some a lower one. So why not take a snapshot of the gene pool as it is now, and define human values accordingly? (And iterate: if the current values prefer to have some alleles removed, then remove them, and define human values according to the remaining ones.)
Second, yes, human behavior and preferences change depending on the environment. For example, when humans are hungry, they may prefer to eat even if it means killing another person; but when they are not hungry, they may prefer to live in peace with other humans. So which one is the true preference? Well, humans also have preferences about the environment, e.g. they prefer not being hungry to being hungry. So we can try to iterate these values and environments together, and find that the attractor is the situation where humans are not hungry and live in peace with other humans. (Technically, there could be multiple attractors.)
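To make "iterate values and environments together" a bit more concrete, here is a toy sketch. Everything in it is an assumption made up for illustration: the single `hunger` variable, the rule that peacefulness is `1 - hunger`, and the update formula are not a model of real humans. It only shows the shape of the idea: preferences are read off the environment, the environment is then changed by those preferences, and we repeat until nothing moves.

```python
def step(hunger: float) -> float:
    """One round of the toy loop; hunger is a number in [0, 1]."""
    # Toy assumption: preferences depend on the environment,
    # so hungrier people are less peaceful.
    peace = 1.0 - hunger
    # Toy assumption: cooperation lowers hunger and conflict raises it.
    # net_push > 0 means conflict dominates, net_push < 0 means cooperation does.
    net_push = 0.5 - peace
    # The hunger * (1 - hunger) factor keeps the value inside [0, 1].
    return hunger + hunger * (1.0 - hunger) * net_push


def find_attractor(hunger: float, tol: float = 1e-9, max_rounds: int = 10_000) -> float:
    """Iterate step() until the state stops changing (or we give up)."""
    for _ in range(max_rounds):
        nxt = step(hunger)
        if abs(nxt - hunger) < tol:
            return nxt
        hunger = nxt
    return hunger


if __name__ == "__main__":
    for start in (0.3, 0.5, 0.7):
        print(start, "->", find_attractor(start))
```

In this made-up model, a start below 0.5 drifts toward "not hungry and peaceful", while a start above 0.5 drifts toward "hungry and fighting", so it also illustrates the parenthetical about multiple attractors. A start at exactly 0.5 stays put, but that is an unstable balance point rather than an attractor.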
The facts that alleles play zero-sum games, and that human preferences depend on the environment, do not by themselves make the task of finding human “extrapolated volition” impossible.