A simple example of conditional orthogonality in finite factored sets
Recently, MIRI researcher Scott Garrabrant has publicized his work on finite factored sets. It allegedly offers a way to understand agency and causality in a set-up like the causal graphs championed by Judea Pearl. Unfortunately, the definition of conditional orthogonality is very confusing. I’m not aware of any public examples of people demonstrating that they understand it, but I didn’t really understand it until an hour ago, and I’ve heard others say that it went above their heads. So, I’d like to give an example of it here.
In a finite factored set, you have your base set S, and a set B of ‘factors’ of your set. In my case, the base set S will be four-dimensional space. I’m sorry, I know that’s one more dimension than the number that well-adjusted people can visualize, but it really would be a much worse example if I were restricted to three dimensions. We’ll think of the points in this space as tuples (x1,x2,x3,x4) where each xi is a real number between, say, −2 and 2.[^1] We’ll say that X1 is the ‘factor’, aka partition, that groups points together based on what their value of x1 is, and similarly for X2, X3, and X4, and set B={X1,X2,X3,X4}. I leave it as an exercise for the reader to check whether this is in fact a finite factored set. Also, I’ll talk about the ‘value’ of partitions and factors. Technically, I suppose you could say that the ‘value’ of some partition at a point is the set in the partition that contains the point, but I’ll use it to mean that, for example, the ‘value’ of X1 at point (x1,x2,x3,x4) is x1. If you think of partitions as questions where different points in S give different answers, the ‘value’ of a partition at a point is the answer to the question.
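For readers who like to poke at things, here is a minimal sketch of this set-up in Python. It assumes a coarse grid (the integers from −2 to 2) in place of the real interval, in the spirit of footnote 1; the grid and the names `VALUES`, `S`, `value`, `block` and `is_factorization` are all made up for illustration. The last function is one way to do the exercise: it checks that choosing one part from each factor always pins down exactly one point.

```python
from itertools import product

VALUES = [-2, -1, 0, 1, 2]            # assumption: coarse grid standing in for [-2, 2]
S = list(product(VALUES, repeat=4))   # base set: every tuple (x1, x2, x3, x4)

def value(i, point):
    """'Value' of the factor with 0-based index i at a point: its i-th coordinate."""
    return point[i]

def block(i, v):
    """The part of factor i made of the points whose i-th coordinate equals v."""
    return frozenset(p for p in S if p[i] == v)

def is_factorization():
    """Check that one part chosen from each factor always meets in exactly one point."""
    blocks = {(i, v): block(i, v) for i in range(4) for v in VALUES}
    return all(
        len(frozenset.intersection(*(blocks[i, v] for i, v in enumerate(choice)))) == 1
        for choice in product(VALUES, repeat=4)
    )

print(len(S), value(0, (1, 0, -1, 2)), is_factorization())
```

Factors are referred to by 0-based index here, so index 0 plays the role of X1 and index 3 the role of X4.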
[EDIT: for the rest of the post, you might want to imagine S as points in space-time, where x4 represents the time, and (x1,x2,x3) represent spatial coordinates—for example, inside a room, where you’re measuring from the north-east corner of the floor. In this analogy, we’ll imagine that there’s a flat piece of sheet metal leaning on the floor against two walls, over that corner. We’ll try conditioning on that—so, looking only at points in space-time that are spatially located on that sheet—and see that distance left is no longer orthogonal to distance up, but that both are still orthogonal to time.]
Now, we’ll want to condition on the set E={(x1,x2,x3,x4)|x1+x2+x3=1}. The thing with E is that once you know you’re in E, x1 is no longer independent of x2, like it was before, since they’re linked together by the condition that x1+x2+x3=1. However, x4 has nothing to do with that condition. So, what’s going to happen is that conditioned on being in E, X1 is orthogonal to X4 but not to X2.
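As an informal sanity check on that intuition (not the actual definition of orthogonality, which comes next), one can count which pairs of values show up together. The sketch below again assumes the integer grid from −2 to 2; `joint_values` and `is_full_grid` are hypothetical helper names.

```python
from itertools import product

VALUES = [-2, -1, 0, 1, 2]            # assumption: coarse grid standing in for [-2, 2]
S = list(product(VALUES, repeat=4))
E = [p for p in S if p[0] + p[1] + p[2] == 1]   # the set we condition on

def joint_values(points, i, j):
    """All pairs (value of factor i, value of factor j) that occur among the points."""
    return {(p[i], p[j]) for p in points}

def is_full_grid(points, i, j):
    """True if every occurring value of factor i is seen with every occurring value of factor j."""
    pairs = joint_values(points, i, j)
    firsts = {a for a, _ in pairs}
    seconds = {b for _, b in pairs}
    return len(pairs) == len(firsts) * len(seconds)

print(is_full_grid(S, 0, 1))   # x1 against x2 before conditioning
print(is_full_grid(E, 0, 1))   # x1 against x2 inside E
print(is_full_grid(E, 0, 3))   # x1 against x4 inside E
```

On this grid, the pairs of x1 and x2 stop filling out the whole square once you restrict to E, while the pairs of x1 and x4 still do.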
In order to show this, we’ll check the definition of conditional orthogonality, which actually refers to this thing called conditional history. I’ll write out the definition of conditional history formally, and then try to explain it informally: the conditional history of X given E, which we’ll write as h(X|E), is the smallest set of factors H⊆B satisfying the following two conditions:
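1. For all s,t∈E, if s ∼_b t for all b∈H, then s ∼_X t.
2. For all s,t∈E and r∈S, if r ∼_b s for all b∈H and r ∼_b′ t for all b′∈B∖H, then r∈E.

(Here s ∼_b t means that s and t lie in the same part of the partition b, which is to say that b has the same ‘value’ at both points.)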
Condition 1 means that, if you think of the partitions as carving up the set S, then the partition X doesn’t carve E up more finely than if you carved according to everything in h(X|E). Another way to say that is that if you know you’re in E, knowing everything in the conditional history of X in E tells you what the ‘value’ of X is, which hopefully makes sense.
Condition 2 says that if you want to know if a point is in E, you can separately consider the ‘values’ of the partitions in the conditional history, as well as the other partitions that are in B but not in the conditional history. So it’s saying that there’s no ‘entanglement’ between the partitions in and out of the conditional history regarding E. This is still probably confusing, but it will make more sense with examples.
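The two conditions can be sketched as predicates. This is a simplified version under a few assumptions of mine: the integer grid from −2 to 2, factors named by 0-based coordinate index, and X restricted to being one of the factors. Because the factors here are just coordinates, the point r in condition 2 is completely determined by s, t and H, so instead of searching over all of S the code builds that one point directly. The names `agree`, `condition_1` and `condition_2` are hypothetical.

```python
from itertools import product

VALUES = [-2, -1, 0, 1, 2]            # assumption: coarse grid standing in for [-2, 2]
S = list(product(VALUES, repeat=4))

def agree(p, q, H):
    """True if p and q have the same value for every factor index in H."""
    return all(p[i] == q[i] for i in H)

def condition_1(H, x, E):
    """Points of E that agree on every factor in H also agree on factor x."""
    return all(s[x] == t[x] for s in E for t in E if agree(s, t, H))

def condition_2(H, E):
    """Taking the H-coordinates of s and the remaining coordinates of t never leaves E."""
    members = set(E)
    return all(
        tuple(s[i] if i in H else t[i] for i in range(4)) in members
        for s in E for t in E
    )

# With no conditioning at all (E = S), {X1} passes both conditions for X = X1,
# while the empty set already fails condition 1.
print(condition_1({0}, 0, S), condition_2({0}, S), condition_1(set(), 0, S))
```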
Now, what’s conditional orthogonality? That’s pretty simple once you get conditional histories: X and Y are conditionally orthogonal given E if the conditional history of X given E doesn’t intersect the conditional history of Y given E. So it’s saying that once you’re in E, the things determining X are different to the things determining Y, in the finite factored sets way of looking at things.
Let’s look at some conditional histories in our concrete example: what’s the history of X1 given E? Well, it’s got to contain X1. Once you know you’re in E, the only way to learn the value of X1 without being told it is to work it out from X2 and X3 together, and a set that contains X2 and X3 but not X1 satisfies condition 1 but fails condition 2, for the same kind of reason as in the example coming up. But {X1} can’t be the whole thing. Consider the point r=(0.5,0.4,0.4,0.7), which is not in E, since its first three coordinates add up to 1.3. If you just knew the value of X1 at r, that would be compatible with r actually being s=(0.5,0.25,0.25,1), which is in E. And if you just knew the values of X2, X3, and X4, you could imagine that r was actually equal to t=(0.2,0.4,0.4,0.7), which is also in E. So, if you considered the factors in {X1} separately to the other factors, you’d conclude that r could be in E, but it’s actually not! This is exactly the thing that condition 2 is telling us can’t happen. In fact, the conditional history of X1 given E is {X1,X2,X3}, which I’ll leave for you to check. I’ll also let you check that the conditional history of X2 given E is {X1,X2,X3}.
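The counterexample above can be replayed mechanically. The sketch below uses exact fractions to avoid floating-point surprises when adding decimals; `point`, `in_E` and `glue` are hypothetical helper names, and factors are indexed from 0.

```python
from fractions import Fraction

def point(*coords):
    """Build a point with exact rational coordinates from decimal strings."""
    return tuple(Fraction(c) for c in coords)

def in_E(p):
    """Membership in E: the first three coordinates sum to 1."""
    return p[0] + p[1] + p[2] == 1

def glue(s, t, H):
    """Take the coordinates indexed by H from s and all the others from t."""
    return tuple(s[i] if i in H else t[i] for i in range(len(s)))

s = point("0.5", "0.25", "0.25", "1")   # in E, supplies the value of X1
t = point("0.2", "0.4", "0.4", "0.7")   # in E, supplies X2, X3 and X4

bad = glue(s, t, H={0})          # candidate history {X1}
ok = glue(s, t, H={0, 1, 2})     # candidate history {X1, X2, X3}

print(in_E(s), in_E(t), in_E(bad), in_E(ok))
```

With the candidate history {X1}, the glued point is (0.5, 0.4, 0.4, 0.7) and falls outside E, so condition 2 fails; with {X1, X2, X3} the same two points glue to something that stays in E.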
Now, what’s the conditional history of X4 given E? It has to include X4, because if someone doesn’t tell you X4 you can’t figure it out: E puts no constraint on x4. In fact, it’s exactly {X4}. Condition 1 is immediate, since knowing X4 tells you X4. Let’s check condition 2: it says that if all the factors outside the conditional history are compatible with some point being in E, and all the factors inside the conditional history are compatible with some point being in E, then it must be in E. That checks out here: you need to know the values of all three of X1, X2, and X3 at once to know if something’s in E, but you get those together when you jointly consider the factors outside your conditional history, which are exactly X1, X2, and X3. So looking at (0.5,0.4,0.4,0.7), if you only look at the values that aren’t told to you by the conditional history, which is to say the first three numbers, you can tell it’s not in E and aren’t tricked. And if you look at (0.5,0.25,0.25,0.7), you look at the factors in {X4} (namely X4), and it checks out, you look at the factors outside {X4} and that also checks out, and the point is really in E.
Hopefully this gives you some insight into condition 2 of the definition of conditional history. It’s saying that when we divide factors up to get a history, we can’t put factors that are entangled by the set we’re conditioning on onto ‘different sides’: all the entangled factors have to be in the history, or they all have to be out of the history.
In summary: h(X1|E)=h(X2|E)={X1,X2,X3}, and h(X4|E)={X4}. So, is X1 orthogonal to X2 given E? No, their conditional histories overlap—in fact, they’re identical! Is X1 orthogonal to X4 given E? Yes, they have disjoint conditional histories.
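If you would rather not check these histories by hand, a brute-force search over subsets of B will do it on a small grid. This is only a sketch under my own assumptions: the integers from −2 to 2 stand in for the real interval, factors are named by 0-based index, and X is restricted to being one of the factors. It relies on the smallest set being unique (which the definition presupposes) and simply returns the first subset, in order of size, that passes both conditions. The names `satisfies`, `history` and `orthogonal` are hypothetical.

```python
from itertools import combinations, product

VALUES = [-2, -1, 0, 1, 2]            # assumption: coarse grid standing in for [-2, 2]
FACTORS = range(4)                    # index 0 is X1, ..., index 3 is X4
S = list(product(VALUES, repeat=4))
E = [p for p in S if p[0] + p[1] + p[2] == 1]

def satisfies(H, x, E):
    """True if H passes conditions 1 and 2 for factor x given E."""
    members = set(E)
    for s in E:
        for t in E:
            if all(s[i] == t[i] for i in H) and s[x] != t[x]:
                return False          # condition 1 fails
            glued = tuple(s[i] if i in H else t[i] for i in FACTORS)
            if glued not in members:
                return False          # condition 2 fails
    return True

def history(x, E):
    """Smallest H passing both conditions, found by trying subsets in order of size."""
    for size in range(len(FACTORS) + 1):
        for H in combinations(FACTORS, size):
            if satisfies(set(H), x, E):
                return set(H)

def orthogonal(x, y, E):
    """Conditional orthogonality: the two conditional histories do not intersect."""
    return history(x, E).isdisjoint(history(y, E))

print(history(0, E), history(1, E), history(3, E))
print(orthogonal(0, 1, E), orthogonal(0, 3, E))
```

Passing `S` in place of `E` gives the unconditioned histories, which makes it easy to compare before and after conditioning.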
Some notes:
- In this case, X1 was already orthogonal to X4 before conditioning. It would be nice to come up with an example where two things that weren’t already orthogonal become so after conditioning. [EDIT: see my next post]
- We didn’t really need the underlying set to be finite for this example to work, suggesting that factored sets don’t really need to be finite for all the machinery Scott discusses.
- We did need the range of each variable to be bounded for this to work nicely. Because all the numbers need to be between −2 and 2, once you’re in E, if x1=2 then x2 can’t be bigger than 1, otherwise x3 can’t go negative enough to get the numbers to add up to 1. But if they could all be arbitrary real numbers, then even once you were in E, knowing x1 wouldn’t tell you anything about x2, but we’d still have that X1 wasn’t orthogonal to X2 given E, which would be weird.
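The arithmetic behind the boundedness note can be sketched with a little interval calculation. The function name `feasible_x2` and the `bound` parameter are my own; `bound=2` matches the post, and `math.inf` stands in for the case where the coordinates are arbitrary real numbers.

```python
import math

def feasible_x2(x1, bound):
    """Interval of x2 in [-bound, bound] for which some x3 in [-bound, bound] gives x1 + x2 + x3 = 1."""
    low = max(-bound, 1 - x1 - bound)    # x3 at its largest pushes x2 down to 1 - x1 - bound
    high = min(bound, 1 - x1 + bound)    # x3 at its smallest lets x2 rise to 1 - x1 + bound
    return low, high

print(feasible_x2(2, 2))          # bounded case from the post, with x1 = 2
print(feasible_x2(0, 2))          # a different x1 gives a different interval
print(feasible_x2(2, math.inf))   # unbounded case: no restriction on x2 at all
```

In the bounded case the interval for x2 shifts as x1 changes, which is the sense in which knowing x1 tells you something about x2; with no bound, the interval is the whole real line whatever x1 is.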
[^1] I know what you’re saying—“That’s not a finite set! Finite factored sets have to be finite!” Well, if you insist, you can think of them as only the numbers between −2 and 2 with two decimal places. That makes the set finite and doesn’t really change anything. (Which suggests that a more expansive concept could be used instead of finite factored sets.)