I agree that a human doesn't have cleanly defined goals, and I agree with most of the additional nuances in your comment to the extent that I can understand them, but OP is talking about superintelligence, and I think modelling a superintelligence as having a constant-across-time utility function is appropriate.
An aligned superintelligence would work with goals of the same kind, even if it's aligned to early AGIs rather than humans. Goals-as-computations may be constant, like the code of a program may be constant, but what's known about their behavior isn't constant. And so the way a goal guides an agent's actions develops as it gets computed further, ultimately according to decisions of the underlying humans/AGIs (and their future iterations) in various hypothetical situations. Also, an uplifted (grown-up) human could be a superintelligence personally; it's not a different kind of thing with respect to the values it could have.
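A minimal sketch of the goals-as-computations analogy, purely as illustration (the function and class names here are hypothetical, not anything from the discussion): the goal is fixed source code, but the agent's knowledge of its behavior grows only as the goal gets computed on more hypothetical situations.

```python
def goal(situation: str) -> float:
    """A fixed 'goal-as-computation': its code never changes across time."""
    # Stand-in for some expensive deliberation, e.g. working out what the
    # underlying humans/AGIs would decide in this hypothetical situation.
    return (sum(ord(c) for c in situation) % 100) / 100.0


class Agent:
    """Knows only the part of the goal's behavior it has computed so far."""

    def __init__(self) -> None:
        self.known_behavior: dict[str, float] = {}  # grows over time

    def evaluate(self, situation: str) -> float:
        # The constant computation is consulted; what changes is the agent's
        # accumulated knowledge of it, not the goal itself.
        if situation not in self.known_behavior:
            self.known_behavior[situation] = goal(situation)
        return self.known_behavior[situation]


agent = Agent()
agent.evaluate("some hypothetical situation")
agent.evaluate("another hypothetical situation")
print(len(agent.known_behavior))  # knowledge of the (constant) goal has grown
```

The point of the sketch is just that "constant utility function" and "developing guidance" aren't in tension: the same unchanging computation yields new verdicts as more of it gets evaluated.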