I really like this model of computation and how naturally it deals with counterfactuals; I’m surprised it isn’t talked about more often.
This raises the issue of abstraction—the core problem of embedded agency.
I’d like to understand this claim better—are you saying that the core problem of embedded agency is relating high-level agent models (represented as causal diagrams) to low-level physics models (also represented as causal diagrams)?
I’m saying that the core problem of embedded agency is relating a high-level, abstract map to the low-level territory it represents. How can we characterize the map-territory relationship based on our knowledge of the map-generating process, and what properties does the map-generating process need to have in order to produce “accurate” & useful maps? How do queries on the map correspond to queries on the territory? Exactly what information is kept and what information is thrown out when the map is smaller than the territory? Good answers to these questions would likely solve the usual reflection/diagonalization problems, and also explain when and why world-models are needed for effective goal-seeking behavior.
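To gesture at the “what information is kept vs. thrown out” question, here’s a toy sketch of my own (not anything from the original discussion): a fine-grained “territory” state, a lossy map-generating function, and a query the resulting map can still answer versus one it can’t. All names and numbers are made up for illustration.

```python
# Toy sketch (my own illustration): a fine-grained "territory" and a lossy
# map-generating process that keeps only coarse-grained counts.

territory = {"particle_positions": [0.12, 0.48, 0.51, 0.90, 0.95]}  # hypothetical micro-state

def make_map(territory, n_cells=2):
    """Map-generating process: bin positions into n_cells, keep only the counts."""
    counts = [0] * n_cells
    for x in territory["particle_positions"]:
        counts[min(int(x * n_cells), n_cells - 1)] += 1
    return {"counts_per_cell": counts}

m = make_map(territory)
print(m["counts_per_cell"][1])  # a query the map CAN answer: how many particles in the right half? -> 3
# A query the map can't answer: the exact position of any particle --
# that information was thrown out by the map-generating process.
```

The interesting questions above are then about which territory-queries survive this kind of compression, and why the surviving ones are the useful ones.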
When I think about how to formalize these sorts of questions in a way that’s useful for embedded agency, the minimum requirements are something like:
- need to represent physics of the underlying world
- need to represent the cause-and-effect process which generates a map from a territory
- need to run counterfactual queries on the map (e.g. in order to do planning; see the sketch after this list)
- need to represent sufficiently general computations to make agenty things possible
… and causal diagrams with symmetry seem like the natural class to capture all that.
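To make the counterfactual-query requirement concrete, here’s a minimal sketch (again my own toy example, not something from the post) of a causal model represented as a dict mapping each node to a function of its parents’ values, with a do()-style intervention that clamps a node before everything downstream is evaluated:

```python
# Minimal sketch (my own toy example): a structural causal model as
# node -> function of already-computed parent values, evaluated in order,
# with do()-style interventions that override a node's structural function.

def evaluate(model, interventions=None):
    """Evaluate every node in topological order, applying any interventions."""
    interventions = interventions or {}
    values = {}
    for node, fn in model.items():  # assumes the dict lists nodes in topological order
        values[node] = interventions[node] if node in interventions else fn(values)
    return values

# Toy "physics": rain -> sprinkler -> wet grass.
model = {
    "rain":      lambda v: 1,
    "sprinkler": lambda v: 0 if v["rain"] else 1,
    "wet":       lambda v: 1 if (v["rain"] or v["sprinkler"]) else 0,
}

print(evaluate(model))               # factual run: {'rain': 1, 'sprinkler': 0, 'wet': 1}
print(evaluate(model, {"rain": 0}))  # counterfactual: what if it hadn't rained?
                                     # -> {'rain': 0, 'sprinkler': 1, 'wet': 1}
```

Planning with the map would then amount to running this kind of intervened evaluation over candidate actions; the “symmetry” part would come in when the same structural function is reused across many nodes (e.g. repeated time steps), which this sketch doesn’t try to show.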