Consistency is the opposite of humility. Instead of saying “sometimes, I don’t and can’t know”, it says “I will definitively answer any question (of the correct form)”.
Let’s assume that there is some consistent utility function that we’re using as a basis for comparison. This could be the “correct” utility function (e.g., God’s); it could be a given individual’s extrapolated consistent utility; or it could be some well-defined function of many people’s utilities.
Given that we’ve assumed this function exists, a quasi-omnipotent agent rationally maximizing it will, by construction, maximize it. That outcome is at least as good as what a “humble” agent, one with only a weakly-ordered objective function, would reach; and in many cases it is strictly better. So, you’re right: under this metric, the best utility function is at least as good as any humble objective.
But if you get the utility function wrong, the result could be much worse than under a humble objective. For instance, consider adding some small amount of Gaussian noise to the utility. Depending on the details (such as the number of world-states and the size of the noise relative to the spread in true utility), the probability that the “optimized” outcome has a true utility arbitrarily close to the lower bound could be arbitrarily high. By contrast, I think you can argue that a “humble” deus ex machina, which leaves other agents more power to choose between world-states over which the machina has no strict preference, would be less likely to end up in such an arbitrarily bad “Goodhart” outcome.
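To make the comparison concrete, here is a rough toy simulation of the argument above. It is only a sketch under assumptions that are mine, not part of the argument itself: true utilities are uniform on [0, 1], the machina sees them through Gaussian noise, the “humble” machina treats everything within a hypothetical `band` of its top score as tied, and the other agents break the tie using their own independent noisy estimate. All names and parameters (`pick_humble`, `band`, `others_sd`, `bad_below`) are made up for illustration.

```python
import random


def pick_optimizer(proxy):
    """Strict maximizer: take the single state with the highest proxy utility."""
    return max(range(len(proxy)), key=proxy.__getitem__)


def pick_humble(proxy, others_estimate, band):
    """Hypothetical 'humble' agent: states within `band` of its best proxy score
    count as tied (no strict preference), and other agents choose among them."""
    cutoff = max(proxy) - band
    tied = [i for i, p in enumerate(proxy) if p >= cutoff]
    return max(tied, key=others_estimate.__getitem__)


def simulate(n_states=50, noise_sd=0.3, others_sd=0.3, band=0.3,
             bad_below=0.2, trials=2000, seed=0):
    """Return {agent: (mean true utility, fraction of 'bad' outcomes)}.

    Assumptions (all illustrative): true utility is uniform on [0, 1]; the
    machina's proxy and the other agents' estimate each add independent
    Gaussian noise; an outcome is 'bad' if its true utility is below bad_below.
    """
    rng = random.Random(seed)
    totals = {"optimizer": 0.0, "humble": 0.0}
    bad = {"optimizer": 0, "humble": 0}
    for _ in range(trials):
        true_u = [rng.random() for _ in range(n_states)]
        proxy = [u + rng.gauss(0.0, noise_sd) for u in true_u]
        others = [u + rng.gauss(0.0, others_sd) for u in true_u]
        picks = {
            "optimizer": pick_optimizer(proxy),
            "humble": pick_humble(proxy, others, band),
        }
        for name, i in picks.items():
            totals[name] += true_u[i]
            bad[name] += true_u[i] < bad_below
    return {name: (totals[name] / trials, bad[name] / trials) for name in totals}


if __name__ == "__main__":
    for sd in (0.0, 0.3, 1.0):
        print(sd, simulate(noise_sd=sd))
```

With no noise the two agents pick the same state, which matches the point that a correct utility function is at least as good as a humble one. Which agent does better once noise is added depends on the parameters, and in particular on the assumption that the other agents have some independent information about the true utility, so treat any numbers this produces as an illustration rather than evidence.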
This response is a bit sketchy, but does it answer your question?
It makes sense to value other agents having power, but are you sure that value can’t be encoded consistently?