Every policy is consistent with every belief state, given sufficiently convoluted rewards/values.
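One concrete way to see this (a minimal sketch; the names and the trivial reward construction are illustrative, not from the original): for any behavior whatsoever, you can write down a reward function that pays 1 for doing exactly what the policy does and 0 otherwise, making that behavior "optimal" by construction.

```python
def rationalizing_reward(policy):
    """Return a reward function under which `policy` is trivially optimal:
    reward 1.0 for taking the policy's action in a state, 0.0 otherwise."""
    return lambda state, action: 1.0 if action == policy(state) else 0.0

# Any behavior, however strange, is 'rational' under some such reward:
policy = lambda state: "left"          # always go left, regardless of state
r = rationalizing_reward(policy)

assert r("s0", "left") == 1.0
assert r("s0", "right") == 0.0
```

So observing behavior alone, without constraints on what counts as a plausible reward, pins down nothing.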
Every posterior distribution is consistent with any evidence, given a sufficiently convoluted prior distribution.
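The posterior claim can be seen with a two-hypothesis Bayes' rule calculation (a sketch; the numbers are illustrative): fix the likelihoods, and the prior alone decides which way the posterior comes out.

```python
def posterior(prior_h1, lik_e_given_h0, lik_e_given_h1):
    """Posterior P(H1 | e) from Bayes' rule over two hypotheses H0, H1."""
    num = prior_h1 * lik_e_given_h1
    den = num + (1 - prior_h1) * lik_e_given_h0
    return num / den

# The evidence favors H1 on its own (likelihood ratio 9:1) ...
lik_h0, lik_h1 = 0.1, 0.9

# ... yet a convoluted-enough prior makes the posterior favor H0 anyway:
open_minded = posterior(0.5, lik_h0, lik_h1)    # → 0.9
dogmatic    = posterior(0.001, lik_h0, lik_h1)  # → ~0.009
```

Same evidence, opposite conclusions; the prior did all the work.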
So to make sense of an agent, or even to recognize one, you need pretty strong priors / extra information about it.