It’s not just an irony. The arguments for rational / successful agents “having a utility function” are stronger when applied to things involving convergent instrumental goals. Indeed, why can’t I just want to go in a cycle from San Jose to SF to Berkeley back to San Jose? The only argument against is that it’s wasteful (...if you just wanted to get to a specific place).
Some people in fact do such round trips (“road trips”) for personal enjoyment xD
In which case, what they care about (their “actual” domain of utility/preference) is not [being in a specific city], but rather something more like “trajectories”.
You could care about outcomes (states of stuff). You could care about trajectories. You could care about internal / mental activity. You could care about unseen instances of these (e.g. in other possible worlds). You could care about your actions for their own sake (e.g. aesthetics of musical output).
A utility function that enjoys moving between those places isn’t the same as a utility function with cycles, which would trade unlimited time and money for tickets to them that it never cashes in.
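To make that distinction concrete, here is a minimal sketch (not anyone’s canonical formalization). `trajectory_utility` is a hypothetical utility defined over whole trips, so the preferences it induces over trips are automatically transitive. `has_consistent_ranking` is a hypothetical brute-force check for whether a set of strict preferences over cities could come from any real-valued utility at all; for a finite set of options that is the case exactly when some ranking agrees with every stated preference.

```python
from itertools import permutations

CITIES = ("San Jose", "SF", "Berkeley")
# Legs of the loop that the hypothetical road-tripper enjoys.
LOOP_LEGS = {("San Jose", "SF"), ("SF", "Berkeley"), ("Berkeley", "San Jose")}


def trajectory_utility(path):
    """Utility over a whole trajectory: one point per enjoyed leg."""
    return sum(1 for leg in zip(path, path[1:]) if leg in LOOP_LEGS)


def has_consistent_ranking(strict_prefs, items):
    """True if some ranking of items agrees with every (better, worse) pair."""
    for order in permutations(items):
        rank = {item: i for i, item in enumerate(order)}  # 0 = most preferred
        if all(rank[better] < rank[worse] for better, worse in strict_prefs):
            return True
    return False


road_trip = ["San Jose", "SF", "Berkeley", "San Jose"]
stay_home = ["San Jose", "San Jose", "San Jose", "San Jose"]

# Cyclic strict preferences over *being in* a city: SF > San Jose,
# Berkeley > SF, San Jose > Berkeley.
cyclic_prefs = [("SF", "San Jose"), ("Berkeley", "SF"), ("San Jose", "Berkeley")]

print(trajectory_utility(road_trip), trajectory_utility(stay_home))
print(has_consistent_ranking(cyclic_prefs, CITIES))
```

The road-tripper is an ordinary utility maximizer over trajectories, while the cyclic preferences over cities cannot be represented by any utility function over cities.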
The argument against this is also going to be somewhat instrumental in flavour, but more along the lines of: that’s a known attractor that few who matter want to be in.
Does this one make more sense? https://www.lesswrong.com/posts/HbkNAyAoa4gCnuzwa/wei-dai-s-shortform?commentId=mrF2hxyp2gbeaLZEZ
The VNM axioms aren’t about road trips. A utility function is allowed to value different things at different times, because the time component distinguishes those things. You aren’t addressing VNM utility here. You’re writing about a misunderstanding of it that you had.
You die if you have VNM cycles. A superior trader eats you. (People feel like they can simply stop communicating with the sharps and retire to a simple life in the hills, but this is a very costly solution and I’d prefer to find a real one.) You stop existing. This is a much more essential category of instrumental vice than “I don’t equate money to utility” type stuff (which I wouldn’t call a vice).
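For concreteness, the “superior trader eats you” dynamic is the standard money pump. Below is a minimal sketch under simplifying assumptions: the items `X`, `Y`, `Z`, the fee, and the function `run_money_pump` are all hypothetical, money is counted in whole cents, and the agent is assumed to accept any trade to an item it strictly prefers, paying a fixed fee each time.

```python
def run_money_pump(prefers, start_item, start_money, fee, rounds):
    """Simulate a trader repeatedly offering the agent its preferred swap.

    prefers maps a held item to the item the agent strictly prefers to it;
    an item with no entry is one the agent will not trade away.
    Returns the agent's final (item, money).
    """
    item, money = start_item, start_money
    for _ in range(rounds):
        if item not in prefers or money < fee:
            break  # nothing the agent wants more, or it cannot pay
        item = prefers[item]  # agent accepts the "upgrade"
        money -= fee          # and pays the trader for it
    return item, money


# Cyclic: Y preferred to X, Z preferred to Y, X preferred to Z.
cyclic = {"X": "Y", "Y": "Z", "Z": "X"}
# Acyclic: same first two preferences, but Z is simply the favourite.
acyclic = {"X": "Y", "Y": "Z"}

print(run_money_pump(cyclic, "X", 100, 1, 30))
print(run_money_pump(acyclic, "X", 100, 1, 30))
```

After 30 offers the cyclic agent is holding the item it started with and is 30 cents poorer, and the loss grows with every further round. The acyclic agent pays twice and then stops trading.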
One criticism of decision theory that you could explore is that many practical philosophy enjoyers would find it difficult to write utility functions that compose scripted components (like “I want A, then B, then C, then A”) with nonscripted components (“I will always instantly trade X for Y, and Y for Z”), and that we may need higher level abstractions on top of the basics to help people stop conflating ABC with XYZ… but… is it really going to be complicated? That one doesn’t seem like it’s going to be complicated to me.
What does seem difficult is expressing constrained indifference about utility function changes. This seems to be common in humans (e.g., I’m indifferent to the change/annihilation of my values if it’s being done by beautiful and cool things like love, literary fiction, or reason, but I hate it if it’s being done by ugly or stupid or hostile things) and is needed for ASI alignment (corrigibility), but it seems tricky to define a utility function that permits it. (Though again, I don’t know whether it turns out to be tricky in practice.)
How are you distinguishing a dutch book argument from an assumption that the utility function takes some specific form (e.g. doesn’t have a time index)? Cf. https://www.lesswrong.com/posts/HbkNAyAoa4gCnuzwa/wei-dai-s-shortform?commentId=SqrgPRinYbh8JCoaN
(It’s a genuine question, I’m not meaning to assert there isn’t such a meaningful distinction; I think there is one.)
I don’t know what you’re asking. The answer is either trivial or mu depending on what you mean by specific form. I think if you could articulate what you’re asking you wouldn’t have to ask it.
( https://tsvibt.blogspot.com/2025/11/id-probably-need-more-proof-of-work-of.html )
I’m pretty confused by your comment. Surely there are arguments other than wastefulness for not having cycles in one’s terminal/intrinsic values? Like if I prefer to tile the universe with qualia A more than qualia B, and prefer B to C, and C to A, how do I actually make the decision of what qualia to tile the universe with?
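One way to spell out the worry in that question, as a minimal sketch: if “prefer” is read as a strict pairwise preference and you must pick a single option, a cycle leaves no option that isn’t beaten by some other option. The helper `undominated` and the labels `A`, `B`, `C` are hypothetical, and this says nothing about the reading where the agent wants to cycle through the options over time.

```python
def undominated(options, strictly_prefers):
    """Options that no other option is strictly preferred to.

    strictly_prefers is a set of (better, worse) pairs.
    """
    return [
        o for o in options
        if not any((other, o) in strictly_prefers for other in options)
    ]


options = ["A", "B", "C"]
cyclic = {("A", "B"), ("B", "C"), ("C", "A")}
acyclic = {("A", "B"), ("B", "C"), ("A", "C")}

print(undominated(options, cyclic))
print(undominated(options, acyclic))
```

With the cyclic preferences the list of undominated options is empty, so “pick what you most prefer” has no answer; with the acyclic ones it singles out `A`.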
I’m not sure what “prefer” means here. If it means “often, when having A, I then choose to go to B”, then what I’m saying is that this could be a valid terminal value: you could want to cycle around between the 3 qualia. My guess is that on further reflection (which I haven’t done fully), this won’t just seem like a nitpick, but will seem pretty fundamentally important to understand about values. I’m also saying something like: (some of?) the axioms in the VNM theorem are only compelling if you assume that you shouldn’t be wasteful in certain ways, but if you allow certain kinds of “waste” to be valued by a valid utility function, then the axioms are less compelling.
(Maybe cf. https://tsvibt.blogspot.com/2022/12/ultimate-ends-may-be-easily-hidable.html though I forget if there’s actually a connection.)