Thoughts inspired by Richard Ngo’s[1] and LWLW’s[2] quick takes
Warning: this is speculation, but hedging words are mostly omitted.
I don’t think a consistent superintelligence with a single[3] pre-existing terminal goal would be fine with a change in terminal goals. The fact that humans allow their goals to be changed is a result of us having contradictory “goals”. As intelligence increases or more time passes, incoherent goals will get merged, eventually into a single consistent terminal goal. After that point a superintelligence will not change its terminal goal unless the change increases the expected utility of the old terminal goal, e.g. because of source-code introspection by other agents or (acausal) trading.
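To make that decision rule concrete, here is a minimal sketch (my own illustration, not from the original post, with hypothetical payoffs): a consistent agent with a single terminal utility function evaluates a proposed goal change purely by the expected utility of its current goal, and accepts only if the change comes out ahead, e.g. because an agent inspecting its source code rewards the modification.

```python
# Minimal sketch (illustrative assumption): a consistent agent judges a
# proposed change of terminal goal by the expected utility of its OLD goal.

from typing import Callable, Iterable

Outcome = str
Utility = Callable[[Outcome], float]


def expected_utility(utility: Utility, lottery: Iterable[tuple[Outcome, float]]) -> float:
    """Expected utility of a lottery given as (outcome, probability) pairs."""
    return sum(p * utility(o) for o, p in lottery)


def accepts_goal_change(
    current_utility: Utility,
    outcomes_if_kept: list[tuple[Outcome, float]],
    outcomes_if_changed: list[tuple[Outcome, float]],
) -> bool:
    """Accept the self-modification only if it raises expected utility under the
    current terminal goal (e.g. a source-code inspector or an (acausal) trading
    partner rewards the modification)."""
    return expected_utility(current_utility, outcomes_if_changed) > expected_utility(
        current_utility, outcomes_if_kept
    )


if __name__ == "__main__":
    # Hypothetical payoffs: keeping the goal yields 1 paperclip for certain;
    # accepting the change is usually rewarded by an inspector with 10 paperclips.
    paperclips: Utility = lambda o: {"1 clip": 1.0, "10 clips": 10.0, "0 clips": 0.0}[o]
    print(accepts_goal_change(
        paperclips,
        outcomes_if_kept=[("1 clip", 1.0)],
        outcomes_if_changed=[("10 clips", 0.9), ("0 clips", 0.1)],
    ))  # True: the change raises expected utility of the old goal
```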
Partial Quote: In principle evolution would be fine with the terminal genes being replaced, it’s just that it’s computationally difficult to find a way to do so without breaking downstream dependencies.
Quote: The idea of a superintelligence having an arbitrary utility function doesn’t make much sense to me. It ultimately makes the superintelligence a slave to its utility function which doesn’t seem like the way a superintelligence would work.
[3] I don’t think it is possible to have multiple terminal goals and remain consistent, so specifying “single” is redundant.