Roman Malov comments on Geometric UDT

Roman Malov 13 Nov 2025 14:56 UTC
10 points
0
And also, if we have two hypotheses, $H_{1}$ and $H_{2}$, and a policy $π$ has a much lower expected value than the BATNA under both of them, so that both terms in the product are negative, then the total product is positive (and large), and the argmax is going to choose this policy (which is strictly worse than the BATNA).
But I guess both of those issues can be easily assumed away.
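
To make the sign issue concrete, here is a toy sketch. It assumes the objective is a plain, unweighted product of per-hypothesis gains over the BATNA, which may not match the post's exact formula. The policy names and numbers are made up for illustration, and the "guard" at the end is just one possible way of assuming the problem away, not something taken from the post.

```python
from math import prod

# Hypothetical gains over BATNA, as (gain under H1, gain under H2).
# All numbers are invented for illustration.
gains = {
    "batna": (0.0, 0.0),         # the disagreement point itself
    "modest": (1.0, 2.0),        # slightly better than BATNA under both hypotheses
    "terrible": (-10.0, -10.0),  # much worse than BATNA under both hypotheses
}

def nash_product(g):
    """Unweighted product of the per-hypothesis gains (an assumed objective)."""
    return prod(g)

# Naive argmax over all policies: two negative gains multiply to a large positive.
naive_choice = max(gains, key=lambda name: nash_product(gains[name]))

# One possible guard (an assumption): only consider policies that are at least
# as good as BATNA under every hypothesis.
feasible = {name: g for name, g in gains.items() if all(x >= 0 for x in g)}
guarded_choice = max(feasible, key=lambda name: nash_product(feasible[name]))

print(naive_choice, guarded_choice)
```

With these numbers the naive argmax picks "terrible" (product 100) over "modest" (product 2), while the guarded version picks "modest".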