… I was expecting you’d push back a bit, so I’m going to fill in the push-back I was expecting here.

Sam’s argument still generalizes beyond the case of graphical models. Our model is going to have some variables in it, and if we don’t know in advance where the agent will be at each timestep, then presumably we don’t know which of those variables (or which function of those variables, etc) will be our Markov blanket. On the other hand, if we knew which variables or which function of the variables were the blanket, then presumably we’d already know where the agent is, so presumably we’re already conditioning on something when we say “the agent’s boundary is a Markov blanket”.

I think that is a basically-correct argument. It doesn’t actually argue that agent boundaries aren’t Markov boundaries; I still think agent boundaries are basically Markov boundaries. But the argument implies that the most naive setup is missing some piece having to do with “where the agent is”.

> Our model is going to have some variables in it, and if we don’t know in advance where the agent will be at each timestep, then presumably we don’t know which of those variables (or which function of those variables, etc) will be our Markov blanket.

No? A probabilistic model can just be a probability distribution over events, with no “random variables in it”. It seemed like your suggestion was to define the random variables later, “on top of” the probabilistic model, not as an intrinsic part of the model, so as to avoid the objection that a physics-ish model won’t have agent-ish variables in it.

So the random variables for our Markov blanket can just be defined as things like skin surface temperature & surface lighting & so on; random variables which can be derived from a physics-ish event space, but not by any particularly simple means (since the location of these things keeps changing).
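To make the "random variables defined on top of the model" move concrete, here is a minimal sketch (all names and numbers hypothetical): the model is just a distribution over whole microstates, and "skin surface temperature" is a function of the event space that tracks the skin wherever it happens to be, rather than a fixed node in any graph.

```python
import random

# Hypothetical toy "physics-ish" event space: each event is a full
# microstate, here a dict giving the skin's location and a temperature
# field over positions 0..4. None of these variables is "in" the model;
# the model is just a distribution over whole microstates.
def sample_microstate(rng):
    skin_pos = rng.randrange(5)                       # where the agent happens to be
    field = [rng.gauss(20.0, 2.0) for _ in range(5)]  # temperature at each position
    return {"skin_pos": skin_pos, "field": field}

# A random variable is just a function on the event space. "Skin surface
# temperature" is derivable from the microstate, but not by picking out
# any fixed coordinate of it, since the skin's location keeps changing.
def skin_surface_temp(omega):
    return omega["field"][omega["skin_pos"]]

rng = random.Random(0)
samples = [skin_surface_temp(sample_microstate(rng)) for _ in range(10_000)]
print(round(sum(samples) / len(samples), 1))  # close to 20.0, the field's mean
```

The point of the sketch is only that `skin_surface_temp` is a perfectly good random variable despite not corresponding to any fixed coordinate of the underlying state.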

> On the other hand, if we knew which variables or which function of the variables were the blanket, then presumably we’d already know where the agent is, so presumably we’re already conditioning on something when we say “the agent’s boundary is a Markov blanket”.

Again, no? If I know skin surface temperature and lighting conditions and so on all add up to a Markov blanket, I don’t thereby know where the skin is.
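The claim that some derived variables "add up to a Markov blanket" is purely distributional, and can be checked without any location information. A minimal sketch, with made-up binary tables: blanket B screens off inside I from outside O exactly when P(i, o | b) = P(i | b) · P(o | b), which holds by construction for a chain I → B → O.

```python
from itertools import product

# Hypothetical toy chain I -> B -> O (inside, blanket, outside),
# with made-up conditional probability tables.
P_I = {0: 0.6, 1: 0.4}
P_B_given_I = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.3, 1: 0.7}}
P_O_given_B = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.4, 1: 0.6}}

joint = {(i, b, o): P_I[i] * P_B_given_I[i][b] * P_O_given_B[b][o]
         for i, b, o in product([0, 1], repeat=3)}

def marginal(keep):  # sum the joint over the dropped coordinates
    out = {}
    for (i, b, o), p in joint.items():
        key = tuple(v for v, k in zip((i, b, o), "ibo") if k in keep)
        out[key] = out.get(key, 0.0) + p
    return out

P_b, P_ib, P_bo = marginal("b"), marginal("ib"), marginal("bo")
for (i, b, o), p in joint.items():
    # P(i,o|b) == P(i|b) * P(o|b)  <=>  P(i,b,o) * P(b) == P(i,b) * P(b,o)
    assert abs(p * P_b[(b,)] - P_ib[(i, b)] * P_bo[(b, o)]) < 1e-12
print("B screens off I from O in this toy joint")
```

Nothing in this check mentions where anything is located; knowing that the screening-off condition holds tells you nothing about the skin's position.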

> I think that is a basically-correct argument. It doesn’t actually argue that agent boundaries aren’t Markov boundaries; I still think agent boundaries are basically Markov boundaries. But the argument implies that the most naive setup is missing some piece having to do with “where the agent is”.

It seems like you agree with Sam way more than would naively be suggested by your initial reply. I don’t understand why.

When I talked with Sam about this recently, he was somewhat satisfied by your reply, but he did think there were a bunch of questions which follow. By giving up on the idea that the Markov blanket can be “built up” from an underlying causal model, we potentially give up on a lot of niceness desiderata which we might have wanted. So there’s a natural question of how much you want to try to recover: properties you could have gotten from “structural” Markov blankets, and might be able to get some other way, but don’t automatically get from arbitrary Markov blankets.

In particular, if I had to guess: causal properties? I don’t know about you, but my OP was mainly directed at Critch, and iiuc Critch wants the Markov blanket to have some causal properties so that we can talk about input/output. I also find it appealing for “agent boundaries” to have some property like that. But if the random variables are unrelated to a causal graph (which, again, is how I understood your proposal) then it seems difficult to recover anything like that.
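For contrast, here is the "structural" notion being given up: in a causal DAG, the Markov blanket of a node is read directly off the graph as its parents, children, and co-parents, which is exactly what hands you the input/output talk (parents as inputs, children as outputs). A sketch on a toy graph with hypothetical edges:

```python
# Hypothetical causal DAG over made-up variables, as a set of directed edges.
edges = {("U", "A"), ("A", "S"), ("W", "S"), ("A", "M"), ("S", "Y")}

def markov_blanket(x, edges):
    """Structural Markov blanket of x: parents, children, and co-parents."""
    parents   = {u for (u, v) in edges if v == x}
    children  = {v for (u, v) in edges if u == x}
    coparents = {u for (u, v) in edges if v in children and u != x}
    return parents | children | coparents

print(sorted(markov_blanket("A", edges)))  # → ['M', 'S', 'U', 'W']
```

With arbitrary random variables defined only on an event space, there is no graph to read parents and children off of, which is why the input/output structure doesn’t come for free.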

I think the missing piece is some sort of notion of “who is on what team”, which I think of as “agentic segmentation”. That is, can we draw borders in meatspace between who is being influenced by the rhetoric of which other agents? If a strong and misaligned superintelligence came to exist, my threat model is that it would do all sorts of dishonorable things by proxy using its rhetorical influence, and we would either see that something weird was going on with the agentic segmentation, or we wouldn’t. If we did see it, we could probably trace the agent segment back to some primary mover, or locate the computational device that is powering the rapidly expanding agent segment.

Ahhh that makes sense, thanks.
