abramdemski comments on Geometric UDT

abramdemski 6 Nov 2025 16:05 UTC
LW: 2 AF: 2
When I try to understand the position you’re speaking from, I suppose you’re imagining a world where an agent’s true preferences are always and only represented by their current introspectively accessible probability+utility,^[1] whereas I’m imagining a world where “value uncertainty” is really meaningful (there can be a difference between the probability+utility we can articulate and our true probability+utility).
If 50% rainbows and 50% puppies is indeed the best representation of our preferences, then I agree: maximize rainbows.
If 50% rainbows and 50% puppies is instead a representation of our credences about our unknown true values, my argument is as follows: the best thing for us would be to maximize our true values (whichever of the two they turn out to be). If we assume value learning works well, then Geometric UDT is a good approximation of that best option.
1. ^ Here “introspectively accessible” really means: what we can understand well enough to directly build into a machine.
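
To make the two readings above concrete, here is a toy sketch. It assumes, as a simplification not spelled out in this comment, that the first reading corresponds to maximizing the credence-weighted arithmetic mean of utilities, and that Geometric UDT can be modeled as maximizing the credence-weighted geometric mean of each hypothesis's utility. The resource split, the yield numbers, and all function names are hypothetical, and the sketch does not model value learning at all.

```python
import math

# Hypothetical toy numbers, not taken from the thread.
CREDENCE = {"rainbows": 0.5, "puppies": 0.5}
# Assumed value produced per unit of resource; rainbows are "cheaper" here.
YIELD = {"rainbows": 2.0, "puppies": 1.0}


def utilities(x):
    """Utility under each hypothesis when a fraction x of resources goes to rainbows."""
    return {
        "rainbows": YIELD["rainbows"] * x,
        "puppies": YIELD["puppies"] * (1 - x),
    }


def arithmetic_score(x):
    """Reading 1: the 50/50 mix is itself the utility function."""
    u = utilities(x)
    return sum(CREDENCE[h] * u[h] for h in CREDENCE)


def geometric_score(x):
    """Reading 2 (assumed model): credence-weighted geometric mean across hypotheses."""
    u = utilities(x)
    return math.prod(u[h] ** CREDENCE[h] for h in CREDENCE)


def best_split(score, steps=100):
    """Grid search over the fraction of resources given to rainbows."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=score)


arithmetic_best = best_split(arithmetic_score)
geometric_best = best_split(geometric_score)
print("arithmetic:", arithmetic_best, "geometric:", geometric_best)
```

Under these assumed numbers the arithmetic score puts everything into rainbows, while the geometric score keeps both hypotheses served, since giving either one zero utility drives the product to zero.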
- cousin_it 6 Nov 2025 17:01 UTC
  LW: 2 AF: 1
  Sure, but if we put a third “if” on top (namely, “it’s a representation of our credences, but also both hypotheses are nosy neighbors that care about either world equally”), doesn’t that undo the second “if” and bring us back to the first?