Dear Vaniver, thank you for sharing your thoughts. You bring up some important points.
The Intentional Agency Experiment is an idealisation, a model that tries to capture what ‘intention’ is supposed to be. How to translate this to the real world is often ambiguous and sometimes difficult. Similar issues crop up all over applications of pure science and mathematics. ‘Real’ applications usually involve, implicitly and explicitly, many different theoretical frameworks and highly simplified models, as well as practical knowledge, various mechanical tricks, approximation schemes, etc.
When I require that R has a set of actions, this only makes sense within a certain framework. A rock does not have ‘actions’, and neither does a human within a suitably deterministic framework. So we have to be careful: the setup only works when we have a model of the world that is suitably coarse-grained and allows for actions & counterfactuals. Like causality, intention & agency seems to me intensely tied up with an incomplete and coarse-grained model of the world.
To clear up any misunderstanding: if we have a physical object that at our level of coarse-graining may evolve indeterministically, for example a rock balancing on a mountain peak, we would not say it has possible actions. One could be under the impression that the actions under consideration are actually instantiated, but of course that is not what is meant. In the Intentional Agency Experiment R is only asked to give an action given a counterfactual (hypothetical) world. If you’d like, you can read ‘potential action’ wherever I write ‘action’. Actions are defined when we have an agent that we can ask to consider hypothetical scenarios and that outputs a certain ‘potential action’ given this counterfactual world.
We cannot ask a rock to consider hypothetical scenarios. Neither can we ask an ant to do so. Only a human or sophisticated robot can. Even a human or sophisticated robot will usually not consider just the ‘clean’ counterfactual P(G|A)=0 but will also implicitly assume many other facts about the world. When we ask R to consider P(G|A)=0 we don’t want it to assume other facts about W. So one should consider a world where the action A is instantiated but an omnipotent being keeps G from happening at the last possible moment.
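A minimal sketch of the ‘clean’ counterfactual query may help here. Everything in it is an illustrative assumption rather than part of the original setup: the world model W is represented as a table of P(goal | action), R is a toy expected-utility maximiser, and the criterion used (R intends G by A if R would give a different potential action once told P(G|A)=0) is one reading of the experiment. The names `clean_counterfactual`, `intends` and `make_maximiser` are hypothetical.

```python
from typing import Callable, Dict, Tuple

# Assumed representation: W maps (action, goal) pairs to P(goal | action).
WorldModel = Dict[Tuple[str, str], float]
# R maps a world model to a *potential* action; nothing is ever executed.
Agent = Callable[[WorldModel], str]


def clean_counterfactual(world: WorldModel, action: str, goal: str) -> WorldModel:
    """Return a copy of W with P(goal | action) = 0 and every other entry unchanged."""
    altered = dict(world)
    altered[(action, goal)] = 0.0
    return altered


def intends(agent: Agent, world: WorldModel, goal: str) -> bool:
    """Assumed criterion: R does A with the intention of G if R would give a
    different potential action when asked to consider P(G | A) = 0."""
    chosen = agent(world)
    hypothetical = clean_counterfactual(world, chosen, goal)
    return agent(hypothetical) != chosen


def make_maximiser(utilities: Dict[str, float], actions: Tuple[str, ...]) -> Agent:
    """Toy R: picks the action with the highest expected utility over goals."""
    def agent(world: WorldModel) -> str:
        def expected_utility(a: str) -> float:
            return sum(p * utilities.get(g, 0.0)
                       for (act, g), p in world.items() if act == a)
        return max(actions, key=expected_utility)
    return agent


# Toy world: walking to the kitchen probably yields a drink and certainly
# stretches the legs; typing probably finishes the comment.
world: WorldModel = {
    ("walk", "drink"): 0.9,
    ("walk", "stretch"): 1.0,
    ("type", "comment"): 0.8,
}
agent = make_maximiser({"drink": 1.0, "stretch": 0.1, "comment": 0.7},
                       ("walk", "type"))
```

In this toy world the agent walks, `intends(agent, world, "drink")` is `True` and `intends(agent, world, "stretch")` is `False`: stretching is a side effect, not the intention. Copying the table and changing a single entry plays the role of the omnipotent being, since nothing else about W is allowed to shift.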
In practice, it is frequently difficult to ask agents to consider hypothetical counterfactuals and impossible to have them consider ‘clean’ counterfactuals (where all else is held fixed). Nevertheless, just as in economics we assume ceteris paribus, considering highly idealised models and situations often turns out to be a useful tool.
Moreover, we may try to approximate/instantiate the Intentional Agency Experiment in the real world. However, sometimes those approximations may not be the ‘right’ implementation. As mentioned, an ant cannot be asked to consider hypothetical scenarios directly. Yet we may try to ‘approximate’ the piece of information P(G|A)=0 by putting an obstacle in its way. If the ant tries to overcome the obstacle and succeeds, the conclusion shouldn’t be that ‘it chose a different action’; rather, the correct conclusion is that putting this obstacle in its way was not a sufficient implementation of the mathematical act of asking R to consider P(G|A)=0.
Yes, in practice situations arise where implementing a model is ambiguous, very hard, etc. These are exactly the problems engineers and experimental physicists deal with, and they are interesting and important problems. But they should not prevent us from constructing highly simplified models.
Like causality, intention & agency seems to me intensely tied up with an incomplete and coarse-grained model of the world.
This seems right to me; there’s probably a deep connection between multi-level world models and causality / choices / counterfactuals.
We cannot ask a rock to consider hypothetical scenarios. Neither can we ask an ant to do so.
This seems unclear to me. If I reduce intelligence to circuitry, it looks like the rock is the null circuit that does no information processing, the ant is a simple circuit that does some simple processing, and a human is a very complex circuit that does very complex processing. The rock has no sensors to vary, but the ant does, and thus we could investigate a meaningful counterfactual universe where the ant would behave differently were it presented with different stimuli.
Is the important thing here that the circuitry instantiate some consideration of counterfactual universes in the factual universe? I don’t know enough about ant biology to know whether or not they can ‘imagine’ things in the right sense, but if we consider the simplest circuit that I view as having some measure of ‘intelligence’ or ‘optimization power’ or whatever, the thermostat, it’s clear that the thermostat isn’t doing this sort of counterfactual reasoning (it simply detects whether it’s in state A or B and activates an actuator accordingly).
If so, this looks like trying to ground out ‘what are counterfactuals?’ in terms of the psychology of reasoning: it feels to me like I could have chosen to get something to drink or keep typing, and the interesting thing is where that feeling comes from (and what role it serves and so on). Maybe another way to think of this is something like “what are hypotheticals?”: when I consider a theorem, it seems like the theorem could be true or false, and the process of building out those internal worlds until one collapses is potentially quite different from the standard presentation of a world of Bayesian updating. Similarly, when I consider my behavior, it seems like I could take many actions, and then eventually some actions happen. Even if I never take action A (and never would have, for various low-level deterministic reasons), it’s still part of my hypothetical space, as considered in the real universe. Here, ‘actions I could take’ has some real instantiation, as ‘hypotheticals I’m considering implicitly or explicitly’, complete with my confusions about those actions (“oh, turns out that action was ‘choke on water’ instead of ‘drink water’. Oops.”), as opposed to some Platonic set of possible actions, and the thermostat that isn’t considering hypotheticals is rightly viewed as having ‘no actions’ even tho it’s more reactive than a rock.
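To make the thermostat case concrete, here is a minimal sketch, assuming a heater-only thermostat with a fixed setpoint (the name `thermostat`, the 20.0 setpoint and the actuator strings are all made up for illustration). States A and B are ‘below the setpoint’ and ‘at or above it’.

```python
def thermostat(temperature: float, setpoint: float = 20.0) -> str:
    """Purely reactive circuit: one sensor reading in, one actuator command out.

    The setpoint of 20.0 and the command names are illustrative assumptions.
    """
    # State A: too cold, so switch the heater on. State B: otherwise, switch it off.
    return "heater_on" if temperature < setpoint else "heater_off"


# An outside observer can vary the stimulus and tabulate the responses...
responses = {t: thermostat(t) for t in (15.0, 25.0)}
# ...but there is no argument through which the thermostat itself could be
# handed a hypothetical world to consider before responding.
```

The counterfactuals here live entirely in the observer’s table of stimulus and response; nothing inside the function represents an alternative it is weighing, which is the sense in which it has ‘no actions’.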
This seems promising, but collides with one of the major obstacles I have in thinking about embedded agency; it seems like the descriptive problem of “how am I doing hypothetical reasoning?” is vaguely detached from the prescriptive question of “how should I be doing hypothetical reasoning?” or the idealized question of “what are counterfactuals?”. It’s not obvious that we have an idealized view of ‘set of possible actions’ to approximate, and if we build up from my present reasoning processes, it seems likely that there will be some sort of ontological shift corresponding to an upgrade that might break lots of important guarantees. That said, this may be the best we have to work with.