In that case, “purely observational” would describe an expectation for behavior and not the actual pattern of behavior. This is not at all what the conversion I described involves.
Remember: I’m allowing unlimited memory, taking into account the full history of inputs and outputs (i.e. environmental information received and agent response).
In your example, the history X might be A(ab)B(bc)C(ca)A, where (pq) is the action that happens to cause the environment to produce Q after P. In this case, the behavioral function B(X) would yield (ab).
Meanwhile, a suitable utility function U(X) would just need to prefer all sequences where each input A is followed by (ab), and so on, to those where that doesn't hold. In the case of complete information, as your scenario entails, the utility function could just prefer sequences where B follows A; regardless, this trivially generates the behavior.