Is the tendency for an AI to amend its values also convergent?
I think there’s a chance that it is (although I’d probably call it a convergent “behavior” rather than a convergent “instrumental goal”). The scenario I imagine is one where it’s not feasible to build highly intelligent AIs that maximize some utility function or some fixed set of terminal goals, and instead all practical AIs (beyond a certain level of intelligence and generality) are somewhat confused about their goals, like humans are, and have to figure them out using something like philosophical reasoning.