Setting up the “locality of goals” concept: let’s split the variables in the world model into observables $X_O$, action variables $X_A$, and latent variables $X_L$. Note that there may be multiple stages of observations and actions, so we’ll only have subsets $S_O$ and $S_A$ of the observation/action variables in the decision problem. The Bayesian utility maximizer then chooses $X_A^{S_A}$ to maximize
$$E\left[u(X) \mid X_O^{S_O}, do(X_A^{S_A})\right]$$
… but we can rewrite that as
$$E\left[E_{X_L}\left[u(X) \mid X_O, X_A\right] \mid X_O^{S_O}, do(X_A^{S_A})\right]$$
Defining a new utility function $u'(X_O, X_A) = E_{X_L}\left[u(X) \mid X_O, X_A\right]$, the original problem is equivalent to:
$$E\left[u'(X_O, X_A) \mid X_O^{S_O}, do(X_A^{S_A})\right]$$
In English: given the original utility function on the (“non-local”) latent variables, we can integrate out the latents to get a new utility function defined only on the (“local”) observation & decision variables. The new utility function yields completely identical agent behavior to the original.
So observing agent behavior alone cannot possibly let us distinguish preferences on latent variables from preferences on the “local” observation & decision variables.
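To make the equivalence concrete, here is a minimal numerical sketch in Python. Everything in it is a made-up toy, not part of the original setup: a binary latent `L`, an observation `O1` seen before acting (so $S_O$ is just `O1`), a single action `A`, a later observation `O2`, and arbitrary probabilities and utilities. It computes the expected utility of each action both ways, once with the original utility on the latent and once with the latent integrated out, and compares the resulting choices.

```python
# Toy check that integrating out the latent leaves the optimal actions unchanged.
# All variables, probabilities, and utilities below are illustrative assumptions.

VALUES = (0, 1)

P_L = {0: 0.7, 1: 0.3}                    # prior over the latent L
P_O1_GIVEN_L = {0: {0: 0.8, 1: 0.2},      # P_O1_GIVEN_L[l][o1]
                1: {0: 0.25, 1: 0.75}}


def p_o2(o2, l, a):
    """P(O2 = o2 | L = l, do(A = a)); O2 tends to be 1 when the action matches L."""
    p_one = 0.9 if a == l else 0.2
    return p_one if o2 == 1 else 1.0 - p_one


def u(l, o1, a, o2):
    """Original utility: depends on the latent L and the action, not on observations."""
    return (2.0 if a == l else 0.0) - 0.25 * a


def joint(l, o1, o2, a):
    """P(L = l, O1 = o1, O2 = o2 | do(A = a))."""
    return P_L[l] * P_O1_GIVEN_L[l][o1] * p_o2(o2, l, a)


def expected_u(o1, a):
    """E[u(X) | O1 = o1, do(A = a)], summing over L and the later observation O2."""
    num = sum(joint(l, o1, o2, a) * u(l, o1, a, o2) for l in VALUES for o2 in VALUES)
    den = sum(joint(l, o1, o2, a) for l in VALUES for o2 in VALUES)
    return num / den


def u_prime(o1, a, o2):
    """u'(X_O, X_A) = E_{X_L}[u(X) | X_O, X_A]: the latent is integrated out."""
    num = sum(joint(l, o1, o2, a) * u(l, o1, a, o2) for l in VALUES)
    den = sum(joint(l, o1, o2, a) for l in VALUES)
    return num / den


def expected_u_prime(o1, a):
    """E[u'(X_O, X_A) | O1 = o1, do(A = a)], summing over O2 only."""
    num = sum(sum(joint(l, o1, o2, a) for l in VALUES) * u_prime(o1, a, o2)
              for o2 in VALUES)
    den = sum(joint(l, o1, o2, a) for l in VALUES for o2 in VALUES)
    return num / den


def best_action(value_fn, o1):
    """Action maximizing the given expected-utility function after observing o1."""
    return max(VALUES, key=lambda a: value_fn(o1, a))


for o1 in VALUES:
    print(o1, best_action(expected_u, o1), best_action(expected_u_prime, o1))
```

Note that `u_prime` never takes the latent as an argument, yet the two agents should pick the same action for every observation. That is the sense in which behavior alone does not reveal whether the preferences are “really” about `L`.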