What kind of object is Q? (I assume its not a string.) Are you directly specifying a distribution of preferences conditioned on observations? Are you specifying a distribution over observations conditioned on preferences and then using inference?
I assume the second case. So given that Q is a predictive model, why wouldn’t you also use Q as your model for planning? What is the advantage of using two separate models? Has anyone proposed using separate models in this way?
To the extent that your model Q is bad, it seems like you are just doomed to perform badly, and the you either need to abandon the model-based approach or come up with a better model. Adding a second model P doesn’t sound promising at face value.
It may be interesting or useful to have two models in this way, but I think it’s an unusual architecture that requires some discussion.
What kind of object is Q? (I assume its not a string.) Are you directly specifying a distribution of preferences conditioned on observations? Are you specifying a distribution over observations conditioned on preferences and then using inference?
I assume the second case. So given that Q is a predictive model, why wouldn’t you also use Q as your model for planning? What is the advantage of using two separate models? Has anyone proposed using separate models in this way?
To the extent that your model Q is bad, it seems like you are just doomed to perform badly, and the you either need to abandon the model-based approach or come up with a better model. Adding a second model P doesn’t sound promising at face value.
It may be interesting or useful to have two models in this way, but I think it’s an unusual architecture that requires some discussion.