I think your setup is weird for the context where people talk about utility functions, circular preferences, and money-pumping. “Having a utility function” is trivial (any behavior at all maximizes some utility function, if the utility function is allowed to take the agent’s whole action history as input) unless the input to the utility function is something like “the state of the world at a certain time in the future”. So in that context, I think we should be imagining something like this:
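For concreteness, here’s a minimal toy sketch of that kind of agent (the integer world-state, the goal, and all the names are just my illustration): a utility function whose input is the predicted state of the world at a future time T, plus a search over plans for whichever one is predicted to land the world in the highest-utility state at T.

```python
from itertools import product

T = 5        # the particular future time the agent cares about
GOAL = 3     # the toy "state of the world" it wants at time T

def world_model(state, action):
    """Toy dynamics: the world-state is an integer, actions nudge it by -1, 0, or +1."""
    return state + action

def utility(future_state):
    """Utility is a function of the world-state at time T, and nothing else."""
    return -abs(future_state - GOAL)

def choose_plan(current_state):
    """Search over all length-T action sequences; pick the one whose predicted
    world-state at time T scores highest under the utility function."""
    def predicted_end_state(plan):
        state = current_state
        for action in plan:
            state = world_model(state, action)   # roll the world model forward
        return state
    return max(product((-1, 0, +1), repeat=T),
               key=lambda plan: utility(predicted_end_state(plan)))

print(choose_plan(0))   # prints an action sequence that puts the state at GOAL by time T
```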
And the stereotypical circular preference would be between the goals “I want the world to be in State A at a particular future time T”, “State B at time T”, and “State C at time T”, with A-at-T preferred to B-at-T, B-at-T to C-at-T, and C-at-T back to A-at-T.
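To make the money-pump concrete, here’s a toy sketch (the fee, the trader, and the numbers are all just illustrative): if the agent will pay a small fee for any swap toward a strictly preferred target state, then cyclic preferences let a trader walk it around the cycle indefinitely.

```python
# Cyclic strict preferences over "the world is in state X at time T": A > B > C > A.
prefers = {("A", "B"), ("B", "C"), ("C", "A")}

def will_pay_to_swap(current_target, offered_target):
    """The agent pays a small fee for any swap to a strictly preferred target state."""
    return (offered_target, current_target) in prefers

target, money, FEE = "C", 10.0, 1.0
for offered in ["B", "A", "C"] * 3:      # a trader just cycles through the three offers
    if will_pay_to_swap(target, offered):
        money -= FEE                      # each swap looks like an improvement locally...
        target = offered

print(target, money)  # ...but the agent ends up aiming at the same state, 9 fees poorer
```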
I think you’re mixing up an MDP RL scenario with a consequentialist planning scenario? MDP RL agents can make decisions based on steering towards particular future states, but they don’t have to, and they often don’t, especially if the discount rate is high.
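Here’s a toy illustration of the discounting point (made-up numbers, not tied to any particular RL setup): a high discount rate corresponds to a discount factor gamma well below 1, in which case the far-future state barely affects the agent’s choice; with gamma near 1, it dominates.

```python
# Two toy reward streams an MDP RL agent could choose between:
#   "now":   reward 1 immediately, then nothing
#   "later": nothing for 9 steps, then reward 10 at step 10
def discounted_return(rewards, gamma):
    return sum(r * gamma**t for t, r in enumerate(rewards))

now   = [1] + [0] * 9
later = [0] * 9 + [10]

for gamma in (0.5, 0.99):
    pick = "later" if discounted_return(later, gamma) > discounted_return(now, gamma) else "now"
    print(f"gamma={gamma}: picks '{pick}'")
# gamma=0.5:  picks 'now'   (heavy discounting: the far-future state barely matters)
# gamma=0.99: picks 'later' (the agent is effectively steering toward a future state)
```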
“AI agents that care about the state of the world in the future, and take actions accordingly, with great skill” are a very important category of AI to talk about because (1) such agents are super dangerous because of instrumental convergence, (2) people will probably make such agents anyway, because people care about the state of the world in the future, and also because such AIs are very powerful and impressive etc.