I’m not familiar with this kind of freedom being discussed for an AGI. I think it would have to be designed in, or at the very least one would have to ensure that the AI does not evolve into a strong optimization process (i.e. a mesa-optimizer), before it could explore something like this.
On the human side, your example about people wanting to change some of their desires is something I have seen explored a bit, e.g. by Stuart Armstrong (searching this post for “meta-value” will turn up some of the discussion on it). The context there is that an AI aiming to thoroughly learn human values would need to account for meta-values like this.