I mention that vulnerability further down. Obviously it doesn’t fit human decision making either, but I think it’s qualitatively closer.
An example of an algorithm that’s closer to the desired behavior would be to sample n counterfactuals from your probability distribution and take the average of these n outcomes. Then take the median of that average over repeated runs of the whole procedure, i.e. the value that the average of the n outcomes exceeds 50% of the time and falls below 50% of the time.
As n approaches infinity this becomes equivalent to expected utility, and at n = 1 it reduces to the plain median of the outcomes. A reasonable value is probably a few hundred, so that you select choices where you come out ahead the vast majority of the time, but still accept low-probability risks and ignore low-probability rewards.
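Here is a rough Monte Carlo sketch of that procedure, in case it helps make it concrete. The names `median_of_means`, `long_shot`, and the `trials` parameter are my own for illustration, and estimating the median by repeating the n-sample average many times is just one way to approximate it.

```python
import random
import statistics


def median_of_means(sample_outcome, n, trials=1001, rng=None):
    """Estimate the median of the average of n sampled outcomes.

    sample_outcome: hypothetical callable taking a random.Random and
        returning the utility of one sampled counterfactual.
    n: number of counterfactuals averaged per batch.
    trials: number of batches used to estimate the median (an assumption
        of this sketch, not part of the description above).
    """
    if rng is None:
        rng = random.Random()
    # Each batch mean is one draw of "the average of n outcomes".
    batch_means = [
        statistics.fmean(sample_outcome(rng) for _ in range(n))
        for _ in range(trials)
    ]
    # The median is the value the batch mean exceeds about half the time.
    return statistics.median(batch_means)


def long_shot(rng):
    """Illustrative gamble: pay 1, win 1,000,000 with probability 1/1000."""
    return 1_000_000.0 if rng.random() < 0.001 else -1.0


if __name__ == "__main__":
    rng = random.Random(0)
    for n in (1, 100, 5000):
        trials = 101 if n >= 5000 else 1001
        print(n, median_of_means(long_shot, n, trials=trials, rng=rng))
```

With this long shot, the expected utility is about +999, but at n = 100 roughly 90% of batches contain no win at all (0.999^100 ≈ 0.905), so the score is the ticket price and the gamble is ignored. Only once n is in the thousands does the score start to track the expected value.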