Toward “timeless” continuous-time causal models
I’m a bit at a loss as to where to put this. I know the inferential gap is too great for it to go anywhere but here, and I know that the number of people on LW interested in this subject could be counted on one hand. The prerequisites would almost certainly be Timeless Causality and more mathematics than anyone is really interested in learning.
So, I apologize in advance if you read this and discover at the end it was a waste of your time. But at the same time, I need people who know about these things to talk about them with me, to ensure that I haven’t gone crazy… yet. And most importantly, I need to know the people who have done this before, so that I don’t have to do it. Google can’t find them.
There are currently some efforts to generalize the causal models of Pearl to continuous-time situations. Most of these attempts involve replacing the discrete causal variables X_i with time-dependent random variables X_i(t). Possibly due to memetic infection from Yudkowsky, I don’t think this is necessarily the correct approach. The philosophical power of Pearl’s theory comes from the fact that it is timeless, that ba’o vimcu ty bu (Lojban, roughly: “t has been removed”).
In order to motivate my working definitions for where such a timeless continuous-time theory will go, I need to go back to classical causality and decide what a timeless formulation actually means, formally. Spoiler: it means replacing time-dependent evolution with a global flow on the phase space of the system. This is more or less in line with what is said in Timeless Physics with regard to the glimpse of “quantum mist” illustrated there.
The role of phase space
What is “timelessness”? The first thing I thought of after reading the timeless subsequence was, “What does a timeless formulation of the wave equation look like?” First of all, this was the right thought, because the wave equation is what I’ll call (after the fact) “classically causal” in a sense to be described soon. I wouldn’t have seen the timelessness in a different mathematical model, because not all mathematical models of reality preserve the underlying phenomena’s causal structure. On the other hand, this was the wrong thought, because the wave equation is not the simplest continuous-time system that would have led me to this formalization of timelessness. Unfortunately the one that is easier for me to see (Lagrangian mechanics) is harder for me to explain, so you’re stuck with a suboptimal explanation.
The wave equation models all sorts of wave-like phenomena: light, acoustic waves, earthquakes, and so on. If we take the propagation speed to be one (as physicists are wont to do), the dispersion relation is ω² = k². Such a dispersion relation satisfies the Kramers-Kronig relation. As it turns out, equations whose dispersion relation satisfies this condition satisfy what I’m calling “classical causality”, but what is more commonly known as finite speed of propagation, or, more physically speaking, the fact that signals stay within their light cone.
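For anyone who wants the dispersion relation spelled out, here is the standard computation in one space dimension. It assumes the wave equation is written with unit speed and that we plug in a plane wave; the symbols u, x, and t are my notation for this sketch and do not appear above.

```latex
\begin{align*}
\partial_t^2 u &= \partial_x^2 u, \qquad u(x,t) = e^{i(kx - \omega t)} \\
\partial_t^2 u &= (-i\omega)^2 u = -\omega^2 u \\
\partial_x^2 u &= (ik)^2 u = -k^2 u \\
\Rightarrow\quad \omega^2 &= k^2, \qquad \omega = \pm k
\end{align*}
```

In this case both the phase velocity ω/k and the group velocity dω/dk are ±1, so every Fourier component travels at exactly the unit speed.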
The most common problem associated with the wave equation is the Cauchy problem. At time zero, we specify the state of the system: its initial position and velocity at every point. Then the solution of the wave equation describes how that initial state evolves with time. From a more abstract point of view, this evolution is a curve in the space of all possible initial states. This space is commonly referred to in the specific case of the wave equation as “energy space”, which further illustrates why this example is a bit bad for pedagogical purposes. From now on, we’re only going to talk about phase space.
Here is where we can remove time from the equation. Instead of thinking of the wave equation as associating to every state in phase space a time-dependent curve issuing forth from it, we’re going to think of the wave equation as specifying a global flow on the whole of phase space, all at once. In summary, I am led to believe that timeless formulations amount to abstracting away the time-dependence of the system’s evolution as a flow on the phase space of the system. And to think, this insight only took three years to internalize, provided I’ve gotten it correct.
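To make “a global flow on phase space” concrete, here is a minimal sketch for a single Fourier mode of the wave equation, which behaves like a harmonic oscillator with frequency k. The names `vector_field`, `flow`, and `energy`, and the coordinates (q, p), are assumptions made for illustration only. The point is that the vector field has no explicit time argument: it is one fixed object on the whole of phase space, and time only appears as the parameter of the flow map it generates.

```python
import math

def vector_field(state, k=1.0):
    """Time-independent vector field on phase space for one mode:
    dq/dt = p, dp/dt = -k^2 q. Note there is no `t` argument."""
    q, p = state
    return (p, -k * k * q)

def flow(state, t, k=1.0):
    """Exact flow map Phi_t generated by vector_field:
    the phase-space point reached from `state` after time t."""
    q, p = state
    c, s = math.cos(k * t), math.sin(k * t)
    return (q * c + (p / k) * s, -k * q * s + p * c)

def energy(state, k=1.0):
    """Energy of the mode, which is constant along the flow."""
    q, p = state
    return 0.5 * (p * p + k * k * q * q)

if __name__ == "__main__":
    x0 = (1.0, 0.5)
    # Flowing for 0.3 and then 0.7 should agree with flowing for 1.0.
    print(flow(flow(x0, 0.3), 0.7))
    print(flow(x0, 1.0))
```

The composition property Phi_t(Phi_s(x)) = Phi_(s+t)(x) is what lets us forget about any particular starting time: every trajectory is just an integral curve of the same fixed vector field.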
The situation for a causal model is harder, in part because stochastic things have shoddy excuses for derivatives. For the moment, we’re going to take the easiest possible continuous-time system: our N causal variables of interest, X_i, take only real values. The space of all the possible states of the system is N-dimensional Euclidean space, which is easy enough to work with. I’m going to implicitly assume that causal variables evolve continuously; that is, the sidewalk doesn’t go from being completely dry to completely wet instantaneously. Things like light switches and push buttons can still be modeled practically by bump functions and the like, so I don’t see this as a real limitation.
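As a small illustration of the light-switch remark, here is one common way to build a smooth “off to on” transition out of the function exp(-1/t). This is a sketch under my own assumptions: the names `smooth_switch`, `t_on`, and `width` are hypothetical, and any other smooth transition function would serve equally well.

```python
import math

def _f(t):
    """Smooth function that is identically zero for t <= 0."""
    return math.exp(-1.0 / t) if t > 0 else 0.0

def smooth_switch(t, t_on=0.0, width=1.0):
    """Switch state as a smooth function of time: exactly 0 (off) up to
    t_on, exactly 1 (on) from t_on + width, and smooth in between."""
    s = (t - t_on) / width
    # At least one of s and 1 - s is positive, so the denominator is never 0.
    return _f(s) / (_f(s) + _f(1.0 - s))

if __name__ == "__main__":
    for t in (-0.5, 0.0, 0.25, 0.5, 0.75, 1.0, 1.5):
        print(t, smooth_switch(t))
```

Shrinking `width` makes the switch flip as fast as you like while the variable still never jumps.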
The somewhat harder bullet to swallow is the assumption that the random variables are Markovian; that is, they are “memoryless” in the sense that only the present state determines the future. Pearl spends some time in Causality defending this assumption from criticism that it doesn’t apply to quantum systems — I believe this defense is reasonable. I believe that causal models are necessarily refinements of our beliefs about what is still for the most part a classical world, and so the Markov assumption is not necessarily unnatural.
The phase space of a system whose states live in N-dimensional Euclidean space is the tangent bundle of that space, which amounts to having an additional copy of N-space attached at every point. Morally speaking, the tangent bundle represents all the directions and speeds in which the system can evolve from any given state.
We need some data about how the system is supposed to evolve: what I will call the causal flow. As best as I can currently conjecture, this data should take the form of a “bundle” of probability measures P, one for each point in N-space, such that each probability measure P(x) is defined over the tangent copy of N-space attached to that point.
By analogy with the previous section, the time-evolution of the system is given by Lipschitz-continuous curves in N-space. (Lipschitz-continuous, because if we assume they are differentiable curves, the Markov assumption goes out the window.) In contrast with the discrete theory of causality, and as mentioned above, we don’t allow causal variables to “jump” spontaneously, and there is a limit to how sharply they can “turn”.
A useful thing to have around would be the probability that the system will evolve from one state to another via a specific choice of one of these curves. Lipschitz-continuous curves are rectifiable, and so one can recapitulate a sort of Riemann sum — if you’re interested, I have it formally written down in a .pdf, but the current format is unfriendly to maths. So for now, you’ll just have to take my word for it when I say I can define the probability of the flow following a specific path. From there, it’s just a path integral to defining the probability of getting from one state to another.
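To give a rough feel for what scoring a path against a causal flow could look like, here is a discretized stand-in. It is emphatically not the formal definition mentioned above. It assumes, purely for illustration, that each P(x) is an isotropic Gaussian over velocities centred on a hypothetical `drift(x)`, and it adds up one log-density term per segment of a piecewise-linear path. The result is a log-density score, not a probability, and every name in it (`drift`, `log_density`, `path_log_score`, `sigma`, `dt`) is made up for this sketch.

```python
import math

def drift(x):
    """Hypothetical mean velocity at state x: relax toward the origin."""
    return [-xi for xi in x]

def log_density(v, x, sigma=1.0):
    """Log-density of velocity v under the assumed Gaussian P(x),
    which lives on the tangent copy of N-space attached at x."""
    m = drift(x)
    n = len(x)
    sq = sum((vi - mi) ** 2 for vi, mi in zip(v, m))
    return -0.5 * sq / sigma ** 2 - n * math.log(sigma * math.sqrt(2 * math.pi))

def path_log_score(points, dt, sigma=1.0):
    """Sum, over the segments of a piecewise-linear path sampled every dt,
    of the log-density of that segment's velocity at its starting point."""
    total = 0.0
    for a, b in zip(points, points[1:]):
        v = [(bi - ai) / dt for ai, bi in zip(a, b)]
        total += log_density(v, a, sigma)
    return total

if __name__ == "__main__":
    with_flow = [[1.0], [0.9], [0.81]]      # moves the way the drift points
    against_flow = [[1.0], [1.1], [1.21]]   # moves against the drift
    print(path_log_score(with_flow, dt=0.1))
    print(path_log_score(against_flow, dt=0.1))
```

Under these assumptions a path that follows the drift scores higher than one that fights it, which is the qualitative behaviour one would want from any real definition.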
Where to go from here?
Given this causal flow, d-separation should arise as a geometrical condition — but perhaps only a local one, for the causal structure of the system can also evolve with time. To intervene in this system is to project it onto a certain hyperplane, presumably, in some yet-to-be-determined way. And finally, there ought to be some way to define counterfactuals, but my limited mathematical foresight has already run too thin.
BONUS: If you’ve made it this far and can’t think of anything else to say, I’m willing to Crocker-entertain probabilities that I’m insane and/or a crackpot.