Goals defined for a person who is not already a formal agent are a living thing, a computational process built from the possible behaviors and decisions of that person in various hypothetical situations. Such goals are not even conceptually prior to those behaviors, though there is still an advantage in formulating them as an unchanging computation that defines the target for an external agent acting in alignment with that person’s own aims. But that computation is never fully computed, and it can only be computed further through the decisions of the person who defines it as their goals.
I agree that a human doesn’t have cleanly defined goals, and I agree with most of the additional nuances in your comment to the extent that I can understand them, but OP is talking about superintelligence, and I think modelling a superintelligence as having a constant-across-time utility function is appropriate.
An aligned superintelligence would work with goals of the same kind, even if it’s aligned to early AGIs rather than humans. A goal-as-computation may be constant, like the code of a program, but what’s known about its behavior isn’t. And so the way it guides an agent’s actions develops as it gets computed further, ultimately according to the decisions of the underlying humans/AGIs (and their future iterations) in various hypothetical situations. Also, an uplifted (grown-up) human could personally be a superintelligence; it’s not a different kind of thing with respect to the values it could have.
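To make the program analogy concrete, here is a minimal Python sketch (purely illustrative, not anything from the thread; all names are hypothetical): the computation defining the goal is a fixed piece of code, but its verdicts only come into existence as particular hypothetical situations get evaluated, so what is known about the goal grows over time even though the code never changes.

```python
# Illustrative sketch only: a "goal" as a constant computation whose behavior
# is revealed lazily, as it gets computed further. Names are hypothetical.

from typing import Callable, Dict, Hashable


class LazilyComputedGoal:
    """A fixed computation over hypothetical situations, evaluated on demand."""

    def __init__(self, decide: Callable[[Hashable], str]):
        # `decide` stands in for the person's decision procedure in hypothetical
        # situations; it is expensive and is never run exhaustively.
        self._decide = decide
        self._known: Dict[Hashable, str] = {}  # what has been computed so far

    def preference(self, situation: Hashable) -> str:
        # The goal's verdict on a situation exists only once it is computed;
        # computing it further is what develops how the goal guides action.
        if situation not in self._known:
            self._known[situation] = self._decide(situation)
        return self._known[situation]

    def known_so_far(self) -> Dict[Hashable, str]:
        # The code of this class never changes, but this mapping keeps growing.
        return dict(self._known)
```

The point of the sketch is just that "constant" applies to the defining code, while the guidance it provides is an open-ended, ever-extending record of evaluations.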