Research scientist, alignment team, Google DeepMind

# Jonathan Richens

# Incentives from a causal perspective

# Agency from a causal perspective

Thanks for commenting! This is an interesting question, and answering it requires digging into some of the subtleties of causality. Unfortunately, the time series framing you propose doesn't work because the time series data is not iid (the variable A = “the next number out of program 1” is not iid), whereas the distributions P(A), P(B) and P(A,B) you are reasoning with assume iid samples by definition. We really need iid here; otherwise we are trying to infer correlation from a single sample. By treating non-iid variables as iid we can see correlations where there are none, but those apparent correlations come from the fact that the next output depends on the previous output, not from the output of one program depending on the output of the other.
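To illustrate how serial dependence alone can produce apparent correlation, here is a minimal sketch. It does not use the pi-digit programs; as a swapped-in illustration it uses two independent ±1 random walks, where each value depends on the previous one. The function names and parameters (`mean_abs_corr`, `n_trials`, `length`) are made up for this example.

```python
import random

def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def steps(rng, n):
    """n iid steps, each +1 or -1 with probability 1/2."""
    return [rng.choice((-1, 1)) for _ in range(n)]

def walk(increments):
    """Running sum of the steps: each value depends on the previous one."""
    total, path = 0, []
    for s in increments:
        total += s
        path.append(total)
    return path

def mean_abs_corr(n_trials=200, length=500, seed=0):
    """Average |correlation| between two causally unrelated sequences,
    once for the iid steps and once for the non-iid walks built from them."""
    rng = random.Random(seed)
    iid_total = walk_total = 0.0
    for _ in range(n_trials):
        a, b = steps(rng, length), steps(rng, length)
        iid_total += abs(pearson(a, b))
        walk_total += abs(pearson(walk(a), walk(b)))
    return iid_total / n_trials, walk_total / n_trials

iid_r, walk_r = mean_abs_corr()
print(f"iid steps: {iid_r:.3f}   random walks: {walk_r:.3f}")
```

The two walks never influence each other, yet a naive correlation computed over their time series is typically far from zero, while the iid steps they are built from show almost none.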

We can fix this by imagining a slightly different setup that I think is faithful to your proposal. It is basically the same thing, but instead of computing pi, both programs have in memory a random string of bits, with 0 or 1 occurring with probability 1/2 for each bit. Both programs just read out the string. Let the string of random bits be identical for programs 1 and 2. Now we can describe each output of the programs as iid. Since the strings are the same for both programs, their outputs are perfectly correlated. And you are right: by looking at the output of one of the programs I can update my beliefs about the output of the other program.

Then we need to ask, how do we generate this experiment? To get the string of random bits we have to sample a coin flip, then make two copies of the outcome and send one to each program. If we tried to do this with two separate coins at different ends of the universe, we would get different bit strings. So the two programs have a shared source of randomness in their past light cones: this is the common cause.
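A minimal sketch of the two ways of generating the bit strings, assuming Python's `random` module as a stand-in for the coin flips. The function names (`shared_source`, `separate_sources`, `agreement`) and the seeds are hypothetical choices for this illustration.

```python
import random

def shared_source(n, seed=0):
    """Flip one coin per bit and send a copy of the result to both programs."""
    rng = random.Random(seed)
    bits = [rng.randint(0, 1) for _ in range(n)]
    return bits, list(bits)  # program 1 and program 2 hold the same string

def separate_sources(n, seed_a=1, seed_b=2):
    """Each program flips its own coin, with no shared randomness."""
    rng_a, rng_b = random.Random(seed_a), random.Random(seed_b)
    a = [rng_a.randint(0, 1) for _ in range(n)]
    b = [rng_b.randint(0, 1) for _ in range(n)]
    return a, b

def agreement(a, b):
    """Fraction of positions where the two outputs match."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

print(agreement(*shared_source(10_000)))     # copies of one coin always agree
print(agreement(*separate_sources(10_000)))  # separate coins agree about half the time
```

Only the version with a copied coin flip reproduces the perfect correlation, which is the sense in which the shared randomness acts as the common cause.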

In the example of the two programs, we have to be careful about what we mean by statistical correlation vs. the more standard / colloquial use of the term. I'm assuming here that when you say “the same program running on opposite ends of the universe, and their outputs would be the same” you are referring to a deterministic program (otherwise there would be no guarantee that the outputs were the same). But if the outputs of the two programs are deterministic, then there can be no statistical correlation between them. Let A be the outcome of the first program and B the outcome of the second. To measure statistical correlation we have to run the two programs many times, generating iid samples of A and B, and they are correlated if P(A, B) is not equal to P(A)P(B). But if the two programs are deterministic, say A = a and B = b with probability 1, then they are not statistically correlated, since P(A = a, B = b) = 1 and P(A = a)P(B = b) = 1, so the joint distribution factorizes. So to get some correlation the outputs of the programs have to be random. For two random algorithms to generate correlated outcomes, they need to share some randomness they can condition their outputs on, i.e. a common cause. With the two planets example, we run into the same problem again. (PS: by correlation Reichenbach means statistical dependence here rather than e.g. Pearson correlation, but the same argument applies.)
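The factorization check can be made concrete with a small sketch. The helper `dependence_gap` is a hypothetical name; it estimates the largest value of |P(A = a, B = b) − P(A = a)P(B = b)| from a list of (a, b) samples.

```python
from collections import Counter

def dependence_gap(samples):
    """Largest |P(a, b) - P(a) P(b)| over all observed values of a and b,
    using empirical frequencies from a list of (a, b) pairs."""
    n = len(samples)
    joint = Counter(samples)
    p_a = Counter(a for a, _ in samples)
    p_b = Counter(b for _, b in samples)
    return max(
        abs(joint[(a, b)] / n - (p_a[a] / n) * (p_b[b] / n))
        for a in p_a
        for b in p_b
    )

# Two deterministic programs: every run gives the same pair of outputs.
deterministic = [(3, 3)] * 1000

# Two programs reading out the same fair coin flip (a shared source of randomness).
shared_coin = [(0, 0), (1, 1)] * 500

print(dependence_gap(deterministic))  # 0.0: the joint factorizes
print(dependence_gap(shared_coin))    # 0.25: statistically dependent
```

The deterministic outputs always match, yet the gap is zero, while the shared coin gives a nonzero gap. This is the distinction between “the outputs are the same” and “the outputs are statistically correlated”.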

Broadly speaking, the point of the (perhaps confusing) reference in the article is to say that if we accept the laws of physics, then all the things we observe in the universe are ultimately generated by causal dynamics (e.g. the classical equations of motion applied to some initial conditions). We can always describe these causal dynamics plus initial conditions using a causal model. So there is always “some causal model” that describes our data.

Yes, I can flip two independent coins a finite number of times and get strings that appear to be correlated. But in the asymptotic limit the probability that they are the same (or correlated at all) goes to zero. Hence two causally unrelated things can appear dependent for finite sample sizes. But when we have infinite samples (which is the limit we assume when making statements about probabilities) we get P(a,b) = P(a)P(b).
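A rough numerical sketch of this finite-sample effect, assuming fair coins simulated with Python's `random` module. The names `prob_identical` and `empirical_gap`, the seed and the sample sizes are arbitrary choices for the illustration.

```python
import random

def prob_identical(n):
    """Exact probability that two independent fair n-bit strings are identical."""
    return 0.5 ** n

def empirical_gap(n, seed=0):
    """|P(a=1, b=1) - P(a=1) P(b=1)| estimated from n flips of two independent coins."""
    rng = random.Random(seed)
    a = [rng.randint(0, 1) for _ in range(n)]
    b = [rng.randint(0, 1) for _ in range(n)]
    p_a = sum(a) / n
    p_b = sum(b) / n
    p_ab = sum(x & y for x, y in zip(a, b)) / n
    return abs(p_ab - p_a * p_b)

for n in (10, 1_000, 100_000):
    print(n, prob_identical(n), empirical_gap(n))
```

For small n the estimated gap can be noticeably different from zero even though the coins are independent; as n grows it shrinks toward zero, consistent with P(a,b) = P(a)P(b) in the limit.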