Two More Decision Theory Problems for Humans (Wei Dai)
I think when I wrote that post, I was mostly thinking of the problems as human rationality problems, but seeing it here (and also this comment) reminds me that there’s an AI alignment angle as well. Specifically, in value learning, it must be an issue if the human and the AI do not share the same ontology or decision theory (especially if the human has non-consequentialist values). Are you aware of any literature on this? If so, perhaps some of the techniques can be transferred back into the realm of human rationality.
As far as I know, only MIRI has really engaged with this problem, and they have only talked about it as a problem, without suggesting any solutions.