I could imagine an efficient algorithm that could be said to be approximating a Bayesian agent with a prior including the truth, but I don’t say that with much confidence.
I agree with the second bullet point, but I’m not convinced this is prohibitively hard. That said, not only would we have to make our (arbitrarily chosen) p(obs | utility fn) un-game-able; on one reading of my original post, we would also have to ensure that by the time the agent was no longer gaining much information, it already had a pretty good grasp of the true utility function. This requirement might reduce to a concept like identifiability of the optimal policy.
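To make the "gaining information about the utility function" picture concrete, here is a toy sketch of Bayesian updating over a handful of candidate utility functions, using posterior entropy as a proxy for how much information remains to be gained. The candidates and the likelihood p(obs | utility fn) are entirely made-up assumptions for illustration, not a proposal for how to choose them (which is exactly the hard part under discussion):

```python
import math

# Hypothetical candidate utility functions for a household robot.
candidates = ["clean_often", "clean_rarely", "never_clean"]

# Assumed likelihoods p(obs | utility fn) for one repeated observation,
# say "owner praised the cleaning" -- chosen arbitrarily for illustration.
likelihood = {
    "clean_often": 0.8,
    "clean_rarely": 0.3,
    "never_clean": 0.05,
}

def update(prior, obs_likelihood):
    """One Bayesian update: posterior proportional to prior * likelihood."""
    unnorm = {u: prior[u] * obs_likelihood[u] for u in prior}
    z = sum(unnorm.values())
    return {u: p / z for u, p in unnorm.items()}

def entropy(dist):
    """Shannon entropy in bits; shrinks as the agent stops gaining info."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

# Uniform prior over the candidates (a prior "including the truth"
# in the trivial sense that the true candidate is in the support).
posterior = {u: 1 / len(candidates) for u in candidates}
for step in range(5):
    posterior = update(posterior, likelihood)
    print(step, round(entropy(posterior), 3), posterior)
```

The worry in the paragraph above is then visible in miniature: if the posterior entropy flattens out before the posterior has concentrated on the true utility function (e.g. because the likelihood can't distinguish the remaining candidates, or has been gamed), the agent stops learning while still wrong.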
Identifiability of the optimal policy seems too strong: it’s basically fine if my household robot doesn’t figure out the optimal schedule for cleaning my house, as long as it’s cleaning it somewhat regularly. But I agree that conceptually we would want something like that.