Ah, I see what you mean, and you’re right; the utility function constructed will depend on how the data points are sampled. This isn’t quite the same as saying that the result will depend on what results are actually available, though, unless knowledge about what results will be available is used to determine how to sample the data. This still seems like somewhat of a defect of inverse reinforcement learning, unless there ends up being a good case that some particular way of sampling the data is optimal for revealing underlying preferences and ignoring biases, or something like that.
Given that human desires are not independent (citation not needed)
That’s probably true, but on the other hand, you seem to want to pin the deviations of human behavior from VNM rationality on violations of the independence axiom, and it isn’t clear to me that this is the case (I don’t think the point you were making relies on this, so if you weren’t trying to make that claim then you can ignore this; it just seemed like you might be). There are situations where there are large framing effects (that is, whether A or B is preferred depends on how the options are presented, even if no other outcome C is being mixed in with them), and likely also violations of transitivity (where someone would say A>B, B>C, and C>A whenever you ask them about 2 of them without bringing up the third). It seems likely to me that most paradoxes of human decision-making have more to do with these than they do to violations of independence.
Ah, I see what you mean, and you’re right; the utility function constructed will depend on how the data points are sampled. This isn’t quite the same as saying that the result will depend on what results are actually available, though, unless knowledge about what results will be available is used to determine how to sample the data. This still seems like somewhat of a defect of inverse reinforcement learning, unless there ends up being a good case that some particular way of sampling the data is optimal for revealing underlying preferences and ignoring biases, or something like that.
That’s probably true, but on the other hand, you seem to want to pin the deviations of human behavior from VNM rationality on violations of the independence axiom, and it isn’t clear to me that this is the case (I don’t think the point you were making relies on this, so if you weren’t trying to make that claim then you can ignore this; it just seemed like you might be). There are situations where there are large framing effects (that is, whether A or B is preferred depends on how the options are presented, even if no other outcome C is being mixed in with them), and likely also violations of transitivity (where someone would say A>B, B>C, and C>A whenever you ask them about 2 of them without bringing up the third). It seems likely to me that most paradoxes of human decision-making have more to do with these than they do to violations of independence.