People are confused about agency because they are confused about agents, that is, what it means to be a separate entity that can act. The trouble is that we see no evidence for metaphysical or even physical separation. Instead, separation is ontological, a product of how we frame our observations of the world. And so if you go looking for an objective, ontology-independent understanding of agents and agency, you get confused because you notice the separation is in the map and not inherent to the territory.
I think that’s why people say it. They are genuinely confused.
This proves too much. Taken literally, if separation were purely ontological, it would imply that there are no differences. But there are clearly differences. Separations being partial, continuous, (spatial or temporal) frequency-specific, etc., doesn’t mean there’s no separation, or that it’s just a modeling choice. There’s an objective gap in the bond strengths of the molecules in a chair-floor system that makes the “chairfloor” naturally factor into chair and floor. If one could view the histogram of bond strengths, one might be able to find others. And there might be some bonds formed between chair and floor, maybe even a few strong ones. But there’s an objective difference; a theory which insists there are no qualitative separations misses that, quantitatively, there are. Dismissing the separability of stuff specifically for agents seems like a mistake to me, though I do agree there are many appealing-but-false ways to describe the separability of agentic stuff.
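A toy numerical version of the bond-strength point, with every number invented purely for illustration: strong within-chair and within-floor bonds, a handful of weak chair-floor contacts, and a histogram whose empty middle band is the objective gap any reasonable threshold would use to factor the “chairfloor” back into chair and floor.

```python
import numpy as np

# All bond strengths are made up; only the qualitative gap matters.
rng = np.random.default_rng(0)
within_chair = rng.normal(loc=100.0, scale=5.0, size=500)  # strong internal bonds
within_floor = rng.normal(loc=100.0, scale=5.0, size=500)
chair_floor = rng.normal(loc=0.5, scale=0.2, size=20)      # weak contacts between them

all_bonds = np.concatenate([within_chair, within_floor, chair_floor])

# The histogram is bimodal: a wide run of empty bins sits between the weak
# chair-floor contacts and the strong internal bonds, so any threshold placed
# in that gap recovers the chair/floor factorization.
counts, edges = np.histogram(all_bonds, bins=50)
empty = [i for i, c in enumerate(counts) if c == 0]
print(f"{len(empty)} of 50 histogram bins are empty (the gap between weak and strong bonds)")
```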
We’re both chunks of matter in reality (it’s arguable exactly how much of our environment to include in the self-chunk, but we’re definitely still separate chunks, even in ontologies with continuous-valued identity), and we’re within at most 60 light-milliseconds of each other, but the information flow within each of our brains is vastly higher than the information flow through these comments. It doesn’t matter what ontology you use to describe that: as long as you describe it in enough detail, your description must end up including the gap in connection strengths of various kinds.
Maybe your description doesn’t call out the largest gap, but I claim there’s an objective sense in which Indra’s net is not fully connected, even though every connection exists, in a similar sense to how a fully connected weighted graph produced from a disconnected weighted graph by summing with the fully connected, unweighted graph scaled by a small ε (i.e. W′ = W + εJ), as ε → 0, is sort of still not “morally” a fully connected graph. (And that’s after assuming Indra’s net obeys locality; I’m not sure whether advanced Buddhism accepts that. I’d guess it probably doesn’t conflict with it, but the noobs online I found after half a second of searching seem confused about whether it does.)
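A minimal sketch of the graph point, in notation I’m making up here (W is the disconnected weighted graph, J the complete unweighted one): after adding εJ every edge technically exists, but thresholding anywhere above ε recovers the original components, which is the sense in which the sum is still “morally” disconnected.

```python
import numpy as np

def components_above_threshold(w, thresh):
    """Connected components of the graph that keeps only edges with weight > thresh."""
    n = len(w)
    seen, comps = set(), []
    for start in range(n):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            u = stack.pop()
            if u in comp:
                continue
            comp.add(u)
            stack.extend(v for v in range(n) if v != u and w[u][v] > thresh)
        seen |= comp
        comps.append(sorted(comp))
    return comps

# W: a disconnected weighted graph, two cliques with no edges between them.
W = np.zeros((6, 6))
W[:3, :3] = 1.0
W[3:, 3:] = 1.0
np.fill_diagonal(W, 0.0)

eps = 1e-3
J = np.ones((6, 6)) - np.eye(6)   # the fully connected, unweighted graph
W_prime = W + eps * J             # now every pair of nodes is connected

print(components_above_threshold(W_prime, thresh=0.0))  # one big component: [[0, 1, 2, 3, 4, 5]]
print(components_above_threshold(W_prime, thresh=0.5))  # the original two: [[0, 1, 2], [3, 4, 5]]
```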
To be a bit pedantic (and rhetorical): how do you know? You have some model of the world. When you say there are clearly differences, you are saying this from the standpoint of your world model. In general this is not something I’d need to call out, but here it matters, because the idea of separation is inherent to the very idea of modeling. If you don’t model, then there’s no separation, because there’s nothing to do the sorting into categories. You may hold a contingent belief within your model that, in the counterfactual where neither you nor anyone else did any modeling, there would still be separation, but this is impossible to consider, because considering it itself requires modeling the world.
(Tangential, but to be clear, I would consider “modeling the world” a fairly basic act that happens any time information is created by a control system. The human version of this is just dramatically more complex than the one-bit model a thermostat has.)
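For concreteness, a toy version of the thermostat’s one-bit model (my own sketch, not any particular device): the only information the control system creates about the world is the single bit “below setpoint or not”.

```python
def thermostat_step(temperature: float, setpoint: float = 20.0) -> bool:
    """The thermostat's entire world model: one bit of created information."""
    too_cold = temperature < setpoint  # the one-bit "model" of the world
    return too_cold                    # the action (heat or idle) is read straight off that bit

# The control loop sorts the whole world into exactly two categories.
for t in [18.5, 19.9, 20.1, 23.0]:
    print(t, "->", "heat" if thermostat_step(t) else "idle")
```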
None of this is to say that modeling the world as if separation exists prior to ontology is not useful. In fact, it will lead to more accurate predictions and you should do it! The trouble is that this doesn’t suddenly let us jump to assuming that separation exists independent of ontology, because we have no non-ontological way to observe it. This leaves assertions about the existence (or nonexistence) of separation beyond ontology as metaphysical claims about which we cannot know.
This is why I argue that the idea of a separate agent is confusing. Perhaps I was a bit too strong in saying that we see “no evidence for [...] physical separation”, as I can see how, without laying out the above argument, this could be interpreted as an argument against the value of physical modeling, which was not my intent.
I don’t feel particularly confused about what it means to be an agent. I certainly have no problem pointing out agents in the world. The category has some edge cases, like all other categories. I don’t really expect a very precise definition of agents to be either possible or necessary for AI safety. The line of argument that runs through “what is an agent?” is, to me, one of the least persuasive cases for working on agent foundations.
I am more confused about how agents work, but that is a different issue.
How do you define an agent?
Which of the following entities are agents, and if they are agents, what kind?
1. a thermostat
2. a rocket actively steered by a Kalman filter
3. the membrane of an E. coli
4. a nematode worm
5. an ant swarm
6. a dog
7. a human
8. a human that is alert and focused
9. Amazon, the company
10. a humanoid robot
11. a robot arm
12. the United Kingdom
13. Claude 4.2
14. Claude 4.2 being called by an ‘agentic workflow’ app
15. the process of evolution by natural selection
17. all of humanity over the last thousand years
18. AIXI
19. a Solomonoff inductor
20. a logical inductor
21. a Szilard engine, i.e. a Maxwell’s demon
22. etc.
Agents form an empirical cluster that shares certain properties, such as learning and planning to steer towards narrow targets. I don’t think there is a natural sharp definition. I’d prefer to discuss the agent-like (safety-relevant) properties of particular algorithms and instantiations, rather than categorize edge cases.
But for fun:
1. A very weak agent, or not an agent; not strongly opinionated.
2. Yes
3. The agent frame seems unlikely to be useful here, since a cell membrane performs multiple loosely coupled functions which only make sense in the context of the E. coli.
4. Yes
5. Yes
6. Yes
7. Yes
8. Yes
9. Yes
10. If its software is any good, yes
11. Yes
12. The frame is sometimes useful and sometimes not.
13. Yes
14. Yes
15. No
16. There is no 16th element of the list.
17. The frame is almost never useful and often misleading.
18. Yes
19. Sort of. It is a very general learning algorithm and (in itself) not a long-term planning algorithm, more on the level of the thermostat. Though in a sense it contains agents.
20. ^
21. Sort of.
22. Type error.
I agree that we have strong intuitions about agents and agency. The problem is that those intuitions are difficult to make rigorous. The attempt to make the intuitions rigorous is the path down which confusion lies. If you forgo formal rigor, there need be no confusion, but you give up a lot by doing so in the domain of AI safety.
That would be true for any AI safety plan that relies on a precise definition of agency. I agree that some safety approaches (like direct cosmopolitan value alignment) would probably need this, but I mostly don’t like those plans and think of agents as an empirical cluster.