Spooky Action at a Distance: The No-Communication Theorem

Previously in series: Bell’s Theorem: No EPR “Reality”

When you have a pair of entangled particles, such as oppositely polarized photons, one particle seems to somehow “know” the result of distant measurements on the other particle. If you measure photon A to be polarized at 0°, photon B somehow immediately knows that it should have the opposite polarization of 90°.

Einstein famously called this “spukhafte Fernwirkung” or “spooky action at a distance”. Einstein didn’t know about decoherence, so it seemed spooky to him.

Though, to be fair, Einstein knew perfectly well that the universe couldn’t really be “spooky”. It was a then-popular interpretation of QM that Einstein was calling “spooky”, not the universe itself.

Let us first consider how entangled particles look, if you don’t know about decoherence—the reason why Einstein called it “spooky”:

Suppose we’ve got oppositely polarized photons A and B, and you’re about to measure B in the 20° basis. Your probability of seeing B transmitted by the filter is 50%, and likewise your probability of seeing it absorbed.

But wait! Before you measure B, I suddenly measure A in the 0° basis, and the A photon is transmitted! Now, apparently, the probability that you’ll see B transmitted is about 11.7%. Something has changed! And even if the photons are light-years away, spacelike separated, the change still occurs.
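
If you want to see where those figures come from, they’re just Malus’s law—the chance of a photon passing a filter is cos² of the angle between its polarization and the filter axis. A quick numeric sketch (using the angles from the setup above; this is only a check of the arithmetic, nothing more):

```python
import numpy as np

# If A passes a 0° filter, the oppositely polarized B is at 90°;
# its chance of passing a 20° filter is cos^2(90° - 20°).
print(np.cos(np.radians(70)) ** 2)          # ~0.117, the figure quoted above

# Before anyone measures A, average over A's two equally likely outcomes
# (B left at 90°, or B left at 0°) and you get back the original 50%:
p_if_A_transmitted = np.cos(np.radians(70)) ** 2
p_if_A_absorbed    = np.cos(np.radians(20)) ** 2
print(0.5 * p_if_A_transmitted + 0.5 * p_if_A_absorbed)   # 0.5
```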

You might try to reply:

“No, nothing has changed—measuring the A photon has told you something about the B photon, you have gained knowledge, you have carried out an inference about a distant object, but no physical influence travels faster-than-light.

“Suppose I put two index cards into an envelope, one marked ‘+’ and one marked ‘-’. Now I give one envelope to you, and one envelope to a friend of yours, and you get in a spaceship and travel a few light-years away from each other, and then you open your envelope and see ‘+’. At once you know that your friend is holding the envelope marked ‘-’, but this doesn’t mean the envelope’s content has changed faster than the speed of light.

“You are committing a Mind Projection Fallacy; the envelope’s content is constant, only your local beliefs about distant referents change.”

Bell’s Theorem, covered yesterday, shows that this reply fails. It is not possible that each photon has an unknown but fixed individual tendency to be polarized a particular way. (Think of how unlikely it would seem, a priori, for this to be something any experiment could tell you!)

Einstein didn’t know about Bell’s Theorem, but the theory he was criticizing did not say that there were hidden variables; it said that the probabilities changed directly.

But then how fast does this influence travel? And what if you measure the entangled particles in such a fashion that, in their individual reference frames, each measurement takes place before the other?

These experiments have been done. If you think there is an influence traveling, it travels at least six million times as fast as light (in the reference frame of the Swiss Alps). Nor is the influence fazed if each measurement takes place “first” within its own reference frame.

So why can’t you use this mysterious influence to send signals faster than light?

Here’s something that, as a kid, I couldn’t get anyone to explain to me: “Why can’t you signal using an entangled pair of photons that both start out polarized up-down? By measuring A in a diagonal basis, you destroy the up-down polarization of both photons. Then by measuring B in the up-down/​left-right basis, you can with 50% probability detect the fact that a measurement has taken place, if B turns out to be left-right polarized.”

It’s particularly annoying that nobody gave me an answer, because the answer turns out to be simple: If both photons have definite polarizations, they aren’t entangled. There are just two different photons that both happen to be polarized up-down. Measuring one photon doesn’t even change your expectations about the other.

Entanglement is not an extra property that you can just stick onto otherwise normal particles! It is a breakdown of quantum independence. In classical probability theory, if you know two facts, there is no longer any logical dependence left between them. Likewise in quantum mechanics, two particles each with a definite state must have a factorizable amplitude distribution.
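
To see the difference in actual numbers, here’s a small numpy sketch (the state vectors and the little helper function are my own illustration, not anything canonical): for the factorizable product state, A’s measurement outcome tells you nothing new about B; for the entangled state, conditioning on A’s outcome changes B’s distribution completely.

```python
import numpy as np

up, right = np.array([1.0, 0.0]), np.array([0.0, 1.0])
diag = (up + right) / np.sqrt(2)             # a 45° "diagonal" polarization state

# Two separate photons that both happen to be polarized up-down:
# a factorizable (product) amplitude distribution.
product = np.kron(up, up)

# The entangled pair from the text: sqrt(1/2) * ( [A=up ∧ B=right] - [A=right ∧ B=up] ).
entangled = (np.kron(up, right) - np.kron(right, up)) / np.sqrt(2)

def b_prob_up(state, a_outcome):
    # P(B found up-down | A's measurement found the state a_outcome).
    amp = state.reshape(2, 2)                # rows index A, columns index B
    b_amp = a_outcome @ amp                  # project A onto the given outcome
    return b_amp[0] ** 2 / np.dot(b_amp, b_amp)

# Product state: whatever A's diagonal measurement finds, B's statistics don't budge.
print(b_prob_up(product, diag), b_prob_up(product, up))        # 1.0  1.0

# Entangled state: conditioning on A's outcome does change B's distribution.
print(b_prob_up(entangled, up), b_prob_up(entangled, right))   # 0.0  1.0
```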

Or as old-style quantum theory put it: Entanglement requires superposition, which implies uncertainty. When you measure an entangled particle, you are not able to force your measurement result to take any particular value. So, over on the B end, if they do not know what you measured on A, their probabilistic expectation is always the same as before. (So it was once said).

But in old-style quantum theory, there was indeed a real and instantaneous change in the other particle’s statistics which took place as the result of your own measurement. It had to be a real change, by Bell’s Theorem and by the invisibly assumed uniqueness of both outcomes.

Even though the old theory invoked a non-local influence, you could never use this influence to signal or communicate with anyone. This was called the “no-signaling condition” or the “no-communication theorem”.

Still, on then-current assumptions, they couldn’t actually call it the “no influence of any kind whatsoever theorem”. So Einstein correctly labeled the old theory as “spooky”.

In decoherent terms, the impossibility of signaling is much easier to understand: When you measure A, one version of you sees the photon transmitted and another sees the photon absorbed. If you see the photon absorbed, you have not learned any new empirical fact; you have merely discovered which version of yourself “you” happen to be. From the perspective at B, your “discovery” is not even theoretically a fact they can learn; they know that both versions of you exist. When B finally communicates with you, they “discover” which world they themselves are in, but that’s all. The statistics at B really haven’t changed—the total Born probability of measuring either polarization is still just 50%!

A common defense of the old theory was that Special Relativity was not violated, because no “information” was transmitted, because the superluminal influence was always “random”. As some Hans de Vries fellow points out, information theory says that “random” data is the most expensive kind of data you can transmit. Nor is “random” information always useless: If you and I generate a million entangled particles, we can later measure them to obtain a shared key for use in cryptography—a highly useful form of information which, by Bell’s Theorem, could not have already been there before measuring.
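
To make the shared-key point concrete, here’s a toy simulation—just the anti-correlation itself, with the measurement outcomes stood in for by classical coin flips; a real quantum key distribution protocol involves quite a bit more than this (basis comparison, eavesdropping checks):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 16                                 # pairs measured; a real key would use far more

# Measuring both halves of each oppositely polarized pair in the same basis gives
# perfectly anti-correlated results, each of which is individually a fair coin.
a_bits = rng.integers(0, 2, size=n)    # A's outcomes: 0 = absorbed, 1 = transmitted
b_bits = 1 - a_bits                    # B's outcomes are always the opposite

key_at_A = a_bits
key_at_B = 1 - b_bits                  # B flips locally and now holds the same key
assert np.array_equal(key_at_A, key_at_B)
print(''.join(map(str, key_at_A)))
```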

But wait a minute. Decoherence also lets you generate the shared key. Does decoherence really not violate the spirit of Special Relativity?

Decoherence doesn’t allow “signaling” or “communication”, but it allows you to generate a highly useful shared key apparently out of nowhere. Does decoherence really have any advantage over the old-style theory on this one? Or are both theories equally obeying Special Relativity in practice, and equally violating the spirit?

A first reply might be: “The shared key is not ‘random’. Both you and your friend generate all possible shared keys, and this is a deterministic and local fact; the correlation only shows up when you meet.”

But this just reveals a deeper problem. The counter-objection would be: “The measurement that you perform over at A, splits both A and B into two parts, two worlds, which guarantees that you’ll meet the right version of your friend when you reunite. That is non-local physics—something you do at A, makes the world at B split into two parts. This is spooky action at a distance, and it too violates the spirit of Special Relativity. Tu quoque!”

And indeed, if you look at our quantum calculations, they are written in terms of joint configurations. Which, on reflection, doesn’t seem all that local!

But wait—what exactly does the no-communication theorem say? Why is it true? Perhaps, if we knew, this would bring enlightenment.

Here is where it starts getting complicated. I myself don’t fully understand the no-communication theorem—there are some parts I think I can see at a glance, and other parts I don’t. So I will only be able to explain some of it, and I may have gotten it wrong, in which case I pray to some physicist to correct me (or at least tell me where I got it wrong).

When we did the calculations for entangled polarized photons, with A’s polarization measured using a 30° filter, we calculated that the initial state

√(1/2) * ( [ A=(1 ; 0) ∧ B=(0 ; 1) ] - [ A=(0 ; 1) ∧ B=(1 ; 0) ] )

would be decohered into a blob for

( -(√3)/2 * √(1/2) * [ A=(-(√3)/2 ; 1/2) ∧ B=(0 ; 1) ] )
- ( 1/2 * √(1/2) * [ A=(-(√3)/2 ; 1/2) ∧ B=(1 ; 0) ] )

and symmetrically (though we didn’t do this calculation) another blob for

( 1/2 * √(1/2) * [ A=(1/2 ; (√3)/2) ∧ B=(0 ; 1) ] )
- ( (√3)/2 * √(1/2) * [ A=(1/2 ; (√3)/2) ∧ B=(1 ; 0) ] )

These two blobs together add up, linearly, to the initial state, as one would expect. So what changed? At all?
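
You can check that numerically, if you like. Here’s a short sketch (the `ket` helper is just my shorthand for a joint configuration [ A=a ∧ B=b ], written as a tensor product):

```python
import numpy as np

def ket(a, b):
    """The joint configuration [ A=a ∧ B=b ], as a tensor (Kronecker) product."""
    return np.kron(np.array(a, float), np.array(b, float))

s = np.sqrt(0.5)
initial = s * (ket((1, 0), (0, 1)) - ket((0, 1), (1, 0)))

r3 = np.sqrt(3)
u = (-r3 / 2, 1 / 2)    # A's direction in the first ("ABSORBED") blob
v = (1 / 2, r3 / 2)     # A's direction in the second ("TRANSMITTED") blob

blob1 = -r3 / 2 * s * ket(u, (0, 1)) - 1 / 2 * s * ket(u, (1, 0))
blob2 =  1 / 2 * s * ket(v, (0, 1)) - r3 / 2 * s * ket(v, (1, 0))

print(np.allclose(blob1 + blob2, initial))   # True: the two blobs sum to the initial state
```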

What changed is that the final result at A, for the first blob, is really more like:

(Sensor-A-reads-”ABSORBED”) * (Experimenter-A-sees-”ABSORBED”) *
{ ( -(√3)/2 * √(1/2) * [ A=(-(√3)/2 ; 1/2) ∧ B=(0 ; 1) ] )
- ( 1/2 * √(1/2) * [ A=(-(√3)/2 ; 1/2) ∧ B=(1 ; 0) ] ) }

and correspondingly with the TRANSMITTED blob.

What changed is that one blob in configuration space, was decohered into two distantly separated blobs that can’t interact any more.

As we saw from the Heisenberg “Uncertainty Principle”, decoherence is a visible, experimentally detectable effect. That’s why we have to shield quantum computers from decoherence. So couldn’t the decohering measurement at A, have detectable consequences for B?

But think about how B sees the initial state:

√(1/2) * ( [ A=(1 ; 0) ∧ B=(0 ; 1) ] - [ A=(0 ; 1) ∧ B=(1 ; 0) ] )

From B’s perspective, this state is already “not all that coherent”, because no matter what B does, it can’t make the A=(1 ; 0) and A=(0 ; 1) configurations cross paths. There’s already a sort of decoherence here—a separation that B can’t eliminate by any local action at B.

And as we’ve earlier glimpsed, the basis in which you write the initial state is arbitrary. When you write out the state, it has pretty much the same form in the 30° measuring basis as in the 0° measuring basis.

In fact, there’s nothing preventing you from writing out the initial state with A in the 30° basis and B in the 0° basis, so long as your numbers add up.
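
Here’s a quick numerical check of that basis-arbitrariness (my own toy verification, nothing more): rotate both A’s and B’s reference axes by 30°, re-express the state, and the components come out exactly the same.

```python
import numpy as np

def rot(theta_deg):
    t = np.radians(theta_deg)
    return np.array([[np.cos(t), -np.sin(t)],
                     [np.sin(t),  np.cos(t)]])

s = np.sqrt(0.5)
initial = s * (np.kron([1, 0], [0, 1]) - np.kron([0, 1], [1, 0]))

# Components of the same state, written with both A's and B's axes rotated by 30°:
rewritten = np.kron(rot(30).T, rot(30).T) @ initial
print(np.allclose(rewritten, initial))    # True: same form in the rotated basis
```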

Indeed this is exactly what we did do, when we first wrote out the four terms in the two blobs, and didn’t include the sensor or experimenter.

So when A permanently decohered the blobs in the 30° basis, from B’s perspective, this merely solidified a decoherence that B could have viewed as already existing.

Obviously, this can’t change the local evolution at B (he said, waving his hands a bit).

Now this is only a statement about a quantum measurement that just decoheres the amplitude for A into parts, without A itself evolving in interesting new directions. What if there were many particles on the A side, and something happened on the A side that put some of those particles into identical configurations via different paths?

This is where linearity and unitarity come in. The no-communication theorem requires both conditions: in general, violating linearity or unitarity gives you faster-than-light signaling. (And numerous other superpowers, such as solving NP-complete problems in polynomial time, and possibly Outcome Pumps.)

By linearity, we can consider parts of the amplitude distribution separately, and their evolved states will add up to the evolved state of the whole.

Suppose that there are many particles on the A side, but we count up every configuration that corresponds to some single fixed state of B—say, B=(0 ; 1) or B=France, whatever. We’d get a group of components which looked like:

(AA=1 ∧ AB=2 ∧ AC=Fred ∧ B=France) +
(AA=2 ∧ AB=1 ∧ AC=Sally ∧ B=France) + …

Linearity says that we can decompose the amplitude distribution around states of B, and the evolution of the parts will add to the whole.

Assume that the B side stays fixed. Then this component of the distribution that we have just isolated, will not interfere with any other components, because other components have different values for B, so they are not identical configurations.

And unitary evolution says that whatever the measure—the integrated squared modulus—of this component, the total measure is the same after evolution at A, as before.

So assuming that B stays fixed, then anything whatsoever happening at A, won’t change the measure of the states at B (he said, waving his hands some more).
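
Here is that hand-waving as a toy calculation—a sketch of the bookkeeping, not a proof, and the helper names are my own. Whatever happens on the A side, whether unitary evolution or a decohering projection into the two 30°-basis blobs, B’s summed Born probabilities don’t move:

```python
import numpy as np

rng = np.random.default_rng(1)

def random_unitary(n):
    """A random n x n unitary, via QR decomposition of a complex Gaussian matrix."""
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, r = np.linalg.qr(z)
    return q * (np.diag(r) / np.abs(np.diag(r)))

# The entangled pair as a 2x2 amplitude array: rows index A's state, columns index B's.
amp = np.sqrt(0.5) * np.array([[0, 1], [-1, 0]], dtype=complex)

def b_probs(amplitudes):
    """Born probabilities for B's two outcomes, summing the squared moduli over A."""
    return np.sum(np.abs(amplitudes) ** 2, axis=0)

print(b_probs(amp))                        # [0.5 0.5]

# Let anything unitary happen on the A side alone...
U_A = random_unitary(2)
print(b_probs(U_A @ amp))                  # still [0.5 0.5]

# ...or let A decohere into separate blobs: project A onto each 30°-basis outcome
# and keep the two pieces side by side. B's totals are unchanged either way.
u = np.array([-np.sqrt(3) / 2, 1 / 2])     # one outcome direction for A
v = np.array([1 / 2, np.sqrt(3) / 2])      # the orthogonal outcome direction
blob_u = np.outer(u, u) @ amp
blob_v = np.outer(v, v) @ amp
print(b_probs(blob_u) + b_probs(blob_v))   # still [0.5 0.5]
```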

Nor should it matter whether we consider A first, or B first. Anything that happens at A, within some component of the amplitude distribution, only depends on the A factor, and only happens to the A factor; likewise with B; so the final joint amplitude distribution should not depend on the order in which we consider the evolutions (and he waved his hands a final time).
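
And the order-independence can be checked in the same toy fashion (the particular local operations here are arbitrary choices of mine):

```python
import numpy as np

psi = np.sqrt(0.5) * (np.kron([1, 0], [0, 1]) - np.kron([0, 1], [1, 0])).astype(complex)

t = np.radians(30)
U_A = np.array([[np.cos(t), -np.sin(t)],
                [np.sin(t),  np.cos(t)]])       # some local evolution at A
U_B = np.array([[1, 0], [0, 1j]])               # some local evolution at B

a_then_b = np.kron(np.eye(2), U_B) @ np.kron(U_A, np.eye(2)) @ psi
b_then_a = np.kron(U_A, np.eye(2)) @ np.kron(np.eye(2), U_B) @ psi
print(np.allclose(a_then_b, b_then_a))          # True: the order doesn't matter
```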

It seems to me that from here it should be easy to show that no communication is possible, even considering the simultaneous evolution of A and B. Sadly I can’t quite see the last step of the argument. I’ve spent very little time doing actual quantum calculations—this is not what I do for a living—or it would probably be obvious. Unless it’s more subtle than it appears, but anyway...

Anyway, if I’m not mistaken—though I’m feeling my way here by mathematical intuition—the no-communication theorem manifests as invariant generalized states of entanglement. From B’s perspective, they are entangled with some distant entity A, and that entanglement has an invariant shape that remains exactly the same no matter what happens at A.

To me, at least, this suggests that the apparent non-locality of quantum physics is a mere artifact of the representation used to describe it.

If you write a 3-dimensional vector as “30° west of north, 40° upward slope, and 100 meters long,” it doesn’t mean that the universe has a basic compass grid, or that there’s a global direction of up, or that reality runs on the metric system. It means you chose a convenient representation.

Physics, including quantum physics, is relativistically invariant: You can pick any relativistic frame you like, redo your calculations, and always get the same experimental predictions back out. That we know.

Now it may be that, in the course of doing your calculations, you find it convenient to pick some reference frame, any reference frame, and use that in your math. Greenwich Mean Time, say. This doesn’t mean there really is a central clock, somewhere underneath the universe, that operates on Greenwich Mean Time.

The representation we used talked about “joint configurations” of A and B in which the states of A and B were simultaneously specified. This means our representation was not relativistic; the notion of “simultaneity” is arbitrary. We assumed the universe ran on Greenwich Mean Time, in effect.

I don’t know what kind of representation would (1) be relativistically invariant, (2) show distant entanglement as invariant, (3) directly represent space-time locality, and (4) evolve each element of the new representation in a way that depended only on an immediate neighborhood of other elements.

But that representation would probably be a lot closer to the Tao.

My suspicion is that a better representation might take its basic mathematical objects as local states of entanglement. I’ve actually suspected this ever since I heard about holographic physics and the entanglement entropy bound. But that’s just raw speculation, at this point.

However, it is important that a fundamental representation be as local and as simple as possible. This is why e.g. “histories of the entire universe” make poor “fundamental” objects, in my humble opinion.

And it’s why I find it suspicious to have a representation for calculating quantum physics that talks about a relativistically arbitrary “joint configuration” of A and B, when it seems like each local position has an invariant “distant entanglement” that suffices to determine local evolution. Shouldn’t we be able to refactor this representation into smaller pieces?

Though ultimately you do have to retrieve the phenomenon where the experimenters meet again, after being separated by light-years, and discover that they measured the photons with opposite polarizations. Which is provably not something you can get from individual billiard balls bopping around.

I suspect that when we get a representation of quantum mechanics that is local in every way that the physics itself is local, it will be immediately obvious—right there in the representation—that things only happen in one place at a time.

Hence, no faster-than-light communicators. (Dammit!)

Now of course, all this that I have said—all this wondrous normality—relies on the decoherence viewpoint.

It relies on believing that when you measure at A, both possible measurements for A still exist, and are still entangled with B in a way that B sees as invariant.

All the amplitude in the joint configuration is undergoing linear, unitary, local evolution. None of it vanishes. So the probabilities at B are always the same from a global standpoint, and there is no supraluminal influence, period.

If you tried to “interpret” things any differently… well, the no-communication theorem would become a lot less obvious.

Part of The Quantum Physics Sequence

Next post: “Decoherence is Simple”

Previous post: “Bell’s Theorem: No EPR ‘Reality’”