The interpretation of quantum mechanics is a philosophical puzzle that has baffled physicists and philosophers for about a century. In my view, this confusion is a symptom of our lacking a rigorous theory of epistemology and metaphysics. At the same time, creating such a theory seems to me like a necessary prerequisite for solving the technical AI alignment problem. Therefore, once we had created a candidate theory of metaphysics (Formal Computational Realism (FCR), formerly known as infra-Bayesian Physicalism), the interpretation of quantum mechanics stood out as a powerful test case. In the work presented in this post, we demonstrate that FCR indeed passes this test (at least to a first approximation).
What is so confusing about quantum mechanics? To understand this, let’s take a look at a few of the most popular pre-existing interpretations.
The Copenhagen Interpretation (CI) proposes a mathematical rule for computing the probabilities of observation sequences, via postulating the collapse of the wavefunction. For every observation, you can apply the Born Rule to compute the probabilities of different results, and once a result is selected, the wavefunction is “collapsed” by projecting it to the corresponding eigenspace.
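As a minimal concrete sketch of the Born rule plus projective collapse for a single qubit (standard textbook quantum mechanics; the state and measurement basis here are arbitrary illustrative choices, not anything specific to this post):

```python
import numpy as np

# Toy illustration of the Born rule and projective collapse for a single qubit.

# Normalized state |psi> = (|0> + |1>) / sqrt(2)
psi = np.array([1.0, 1.0], dtype=complex) / np.sqrt(2)

# Measurement in the computational basis {|0>, |1>}: projectors P0, P1.
projectors = [np.diag([1.0, 0.0]).astype(complex),
              np.diag([0.0, 1.0]).astype(complex)]

# Born rule: p(k) = <psi| P_k |psi>
probs = [np.real(psi.conj() @ P @ psi) for P in projectors]
print("Born probabilities:", probs)  # [0.5, 0.5]

# Collapse: after observing outcome k, project onto the eigenspace and renormalize.
k = np.random.choice(len(projectors), p=probs)
psi_post = projectors[k] @ psi
psi_post /= np.linalg.norm(psi_post)
print("Outcome:", k, "post-measurement state:", psi_post)
```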
CI seems satisfactory to a logical positivist: if all we need from a physical theory is computing the probabilities of observations, we have it. However, this is unsatisfactory for a decision-making agent if the agent’s utility function depends on something other than its direct observations. For such an agent, CI offers no well-defined way to compute expected utility. Moreover, while normally decoherence ensures that the observations of all agents are in some sense “consistent”, it is in principle possible to create a situation in which decoherence fails and CI will prescribe contradictory beliefs to different agents (as in the Wigner’s friend thought experiment).
In CI, the wavefunction is merely a book-keeping device with no deep meaning of its own. In contrast, the Many Worlds Interpretation (MWI) takes a realist metaphysical stance, postulating that the wavefunction describes the objective physical state of the universe. This, in principle, admits meaningful unobservable quantities on which the values of agents can depend. However, the MWI has no mathematical rule for computing probabilities of observation sequences. If all “worlds” exist at the same time, there’s no obvious reason to expect to see one of them rather than another. MWI proponents address this by handwaving into existence some “degree of reality” that some worlds possess more than others. However, the fundamental fact remains that there is no well-defined prescription for probabilities of observation sequences, unless we copy the prescription of CI; but the latter is inconsistent with the intent of MWI in cases where decoherence fails, such as Wigner’s friend.
In principle, we can defend an MWI-based decision theory in which the utility function is a self-adjoint operator on the Hilbert space and we are maximizing its expectation in the usual quantum-mechanical sense. Such a decision theory can avoid the need for a well-defined probability distribution over observation sequences. However, it would leave us with an “ontological crisis”: if our agent did not start out knowing quantum mechanics, how would it translate its values into this quantum mechanical form?[1]
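For concreteness, the “expectation in the usual quantum-mechanical sense” is just the textbook formula (nothing here is specific to FCR):

$$\mathbb{E}[U] \;=\; \langle \psi | \hat{U} | \psi \rangle \;=\; \operatorname{Tr}\!\big(\rho\,\hat{U}\big),$$

where $\hat{U}$ is the self-adjoint utility operator, $|\psi\rangle$ is the state of the universe, and $\rho$ is the corresponding density operator.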
The De Broglie-Bohm Interpretation (DBBI) proposes that, in addition to the wavefunction, we should also postulate a classical trajectory following a time-evolution law that depends on the wavefunction (the standard single-particle form of this guidance law is written out after the two issues below). This results in a realist theory with a well-defined distribution over observation sequences. However, it comes with two major issues:
The observations have to be a function of the positional variables only (and not of the momentum variables). This is probably false even for humans and hardly applicable to arbitrary agents.
The theory is not Lorentz invariant. This is disturbing in and of itself, but in particular (even if we somehow circumvented problem #1) it implies that e.g. the experiences of humans moving at relativistic speeds relative to the preferred frame of reference (whatever that is) are devoid of value (i.e. such humans apparently have no qualia).
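For reference, the guidance law mentioned above takes the following standard form for a single non-relativistic spinless particle (this is textbook Bohmian mechanics, not anything specific to this post):

$$\frac{dQ}{dt} \;=\; \frac{\hbar}{m}\,\operatorname{Im}\!\left(\frac{\nabla \psi}{\psi}\right)\!(Q,t),$$

where $Q(t)$ is the particle’s actual position and $\psi$ evolves by the ordinary Schrödinger equation.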
In my view, the real source of all the confusion is the lack of rigorous metaphysics: we didn’t know (prior to this line of research), in full generality, what the type signature of a physical theory should be and how we should evaluate such a theory.
Enter Formal Computational Realism (FCR). According to FCR, the fundamental ontology in which all beliefs and values about the world should be expressed is computable logical facts plus the computational information content of the universe. The universe can contain information about computations (e.g., if someone calculated the 700th digit of pi, then the universe contains this information), and fundamentally this information is all there is[2]. Moreover, given an algorithmic description of a physical theory plus the epistemic state of an agent in relation to computable logical facts, it is possible to formally specify the computational information content that this physical theory implies, from the perspective of the agent. The latter operation is called the “bridge transform”.
To apply this to quantum mechanics, we need to choose a particular algorithmic description. The choice we settled on is fairly natural: We imagine all possible quantum observables as having marginal distributions that obey the Born rule, with the joint distribution being otherwise completely ambiguous (in the sense that imprecise probability allows distributions to be ambiguous, i.e. we have “Knightian uncertainty” about it). The latter is a natural choice, because quantum mechanics has no prescription for the joint distribution of noncommuting observables. The combined values of all observables constitute the “state” that the physical theory computes, and the agent’s policy is treated as an unknown logical fact on which the computation depends.
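As a toy illustration of “Born-rule marginals with a Knightian joint” (this is my own single-qubit sketch of the idea, not the formalism actually used in the post):

```python
import numpy as np
from itertools import product

# For a single qubit, the Born rule fixes the *marginal* distributions of the
# noncommuting observables sigma_z and sigma_x, but gives no joint distribution.
# Under the Knightian-uncertainty reading, every joint with these marginals is admissible.

psi = np.array([np.cos(0.3), np.sin(0.3)], dtype=complex)  # arbitrary pure state

def born_marginal(psi, eigvecs):
    """Born-rule probabilities for a two-outcome observable given its eigenvectors."""
    return [abs(np.vdot(v, psi)) ** 2 for v in eigvecs]

z_basis = [np.array([1, 0], dtype=complex), np.array([0, 1], dtype=complex)]
x_basis = [np.array([1, 1], dtype=complex) / np.sqrt(2),
           np.array([1, -1], dtype=complex) / np.sqrt(2)]

p_z = born_marginal(psi, z_basis)  # marginal for sigma_z
p_x = born_marginal(psi, x_basis)  # marginal for sigma_x
print("sigma_z marginal:", np.round(p_z, 3))
print("sigma_x marginal:", np.round(p_x, 3))

# One admissible joint is the independent coupling p(z, x) = p_z[z] * p_x[x];
# the Frechet bounds show a whole interval of other joints shares the same marginals.
for z, x in product(range(2), range(2)):
    indep = p_z[z] * p_x[x]
    lo = max(0.0, p_z[z] + p_x[x] - 1.0)
    hi = min(p_z[z], p_x[x])
    print(f"p(z={z}, x={x}): independent {indep:.3f}, Frechet range [{lo:.3f}, {hi:.3f}]")
```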
Applying the bridge transform to the above operationalization of quantum mechanics, we infer the computational information content of the universe according to quantum mechanics, and then use the latter to extract the probabilities of various agent experiences. What we discover is as follows:
As in MWI, multiple copies of the agent experiencing different “Everett branches” can coexist.
Unlike MWI, not all Everett branches are instantiated, but only some (according to some well-defined probability distribution).
Assuming decoherence, the probability distribution is such that, from the perspective of any copy of the agent, the universe seems to obey CI probabilities. That is, any statistical test that the agent can run will converge to confirming the CI distribution in the limit with probability 1, for all copies of the agent (a toy illustration of this frequency convergence is sketched after this list).
As opposed to CI, all agents have “consistent” expectations even when decoherence fails. We haven’t fully formalized this claim, but it seems clear from the general nature of FCR, and we did investigate it in the specific example of Wigner’s friend.
As opposed to most pre-existing interpretations, the resulting formalism has precisely defined decision-theoretic prescriptions for an agent in any “weird” (i.e. non-decohering) situation, such as Wigner’s friend. This only requires the agent’s values to be specified in the FCR ontology (and in particular allows the agent to assign value to their own experiences, in some arbitrary history-dependent way, and/or to the experiences of particular other agents).
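Here is the toy frequency-convergence sketch referenced above. It only illustrates the trivial law-of-large-numbers content of the claim under Born-rule sampling; it is not a simulation of the FCR derivation itself, and the state used is an arbitrary choice:

```python
import numpy as np

# Empirical frequencies of repeated measurements converge to the CI/Born probabilities.
rng = np.random.default_rng(0)
theta = 0.3
p_born = [np.cos(theta) ** 2, np.sin(theta) ** 2]  # Born probabilities for a qubit

for n in [100, 10_000, 1_000_000]:
    outcomes = rng.choice(2, size=n, p=p_born)
    freq = np.bincount(outcomes, minlength=2) / n
    print(f"n={n:>9}: empirical {np.round(freq, 4)} vs Born {np.round(p_born, 4)}")
```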
In conclusion, FCR passed a non-trivial test here. It was not obvious to me that it would: before Gergely figured out the details, I wasn’t sure it was going to work at all. As such, I believe this to be a milestone result. (With some caveats: e.g. it needs to be rechecked for the non-monotonic version of the framework.)
>We imagine all possible quantum observables as having marginal distributions that obey the Born rule
Dumb question, but does this approach of yours cash out to representing quantum states as a probability distribution function? How’s this rich enough to represent interference of states and all the quantum phenomena absent from stochastic dynamics?
Strictly speaking, there’s no result saying you can’t represent quantum phenomena by stochastic dynamics (a.k.a. hidden variables). Indeed, e.g. the de Broglie-Bohm interpretation does exactly that. What does exist is Bell’s inequality, which implies that it’s impossible to represent quantum phenomena by local hidden variables (local = the distribution is the limit of causal graphs in which variables are localized in spacetime and causal connections only run along future-directed timelike (not superluminal) separations). Now, our framework doesn’t even fall in the domain of Bell’s inequality, since (i) we have supracontributions (in this post called “ultracontributions”) instead of ordinary probability distributions, and (ii) we have multiple co-existing “worlds”. AFAIK, Bell-inequality-based arguments against local hidden variables don’t extend to either (i) or (ii). As such, it is conceivable that our interpretation is in some sense “local”. On the other hand, I don’t know that it’s local and have no strong reason to believe it.
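For concreteness, here is a minimal sketch of the standard CHSH computation for the singlet state (textbook correlations and angles, nothing specific to our framework):

```python
import numpy as np

# For the singlet state, quantum mechanics predicts E(a, b) = -cos(a - b) for spin
# measurements along angles a and b; the CHSH combination reaches 2*sqrt(2),
# exceeding the local-hidden-variable bound of 2.

def E(a, b):
    # Correlation of outcomes for the singlet state.
    return -np.cos(a - b)

a1, a2 = 0.0, np.pi / 2            # Alice's two measurement angles
b1, b2 = np.pi / 4, 3 * np.pi / 4  # Bob's two measurement angles

S = E(a1, b1) - E(a1, b2) + E(a2, b1) + E(a2, b2)
print("CHSH value:", abs(S))              # ~2.828 = 2*sqrt(2)
print("Local hidden variable bound: 2")
```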
>The interpretation of quantum mechanics is a philosophical puzzle that has baffled physicists and philosophers for about a century. In my view, this confusion is a symptom of our lacking a rigorous theory of epistemology and metaphysics
They are symptoms of each other. Epistemology depends on ontology, ontology depends on the right interpretation of physics...which depends on epistemology.
>What is so confusing about quantum mechanics?
Recovering the appearance of a classical reality from a profoundly unclassical substructure.
>and CI will prescribe contradictory beliefs to different agents (as in the Wigner’s friend thought experiment).
It’s important to notice that Wigner’s Friend only implies an inconsistency about when collapse occurred: there is no inconsistency between what Wigner and the Friend actually measure. So CI is still consistent as an instrumentalist theory. Indeed, you can cast it as an instrumentalist theory, by refusing to speak about collapse as a real process, and only requiring that observers make sharp real-valued observations for some unknown reason—substituting an irrealist notion of measurement for collapse.
>In CI, the wavefunction is merely a book-keeping device with no deep meaning of its own. In contrast
If you are going to be realist about collapse, you need to be realist about wavefunctions—otherwise what is collapsing? Realism about both is consistent, irrealism about both is consistent.
>However, the MWI has no mathematical rule for computing probabilities of observation sequences. If all “worlds” exist at the same time, there’s no obvious reason to expect to see one of them rather than another. MWI proponents address this by handwaving into existence some “degree of reality” that some worlds possess more than others. However, the fundamental fact remains that there is no well-defined prescription for probabilities of observation sequences, unless we copy the prescription of CI; but the latter is inconsistent with the intent of MWI in cases where decoherence fails, such as Wigner’s friend.
“The” MWI is more than one thing. If you are talking about coherent superposition, then there is no need to invent a probabilistic weighting, since the standard quantum mechanical measure (WF amplitude) gives you that via the Born rule. But we can tell we are not in a macroscopic coherent superposition, because it would look weird. If you are talking about decoherent branching, it’s not clear how or whether branches are weighted. That might not matter for the purposes of physics, since the experimenter is only operating in one branch, and any other can have no effect. But for some other purposes, such as large world ethics, or anthropics, it might be nice to have weights for branches.
>and fundamentally this information is all there is
If you want to do decision theory, and not just probability theory, you need ontology, not just epistemology. But why would the process stop there? Is decision theory the most ambitious goal you can have?
Most of the current debate about metaphysics centres on consciousness. Physicalism of a substantial kind, i.e. materialism, seems unable to explain phenomenal consciousness—why should an insubstantial, information-only ontology fare any better? Rather than having anything extra to offer, it is more minimal. The same information can appear in different forms (drinking the wine is not reading the label), which is a strong hint that information is not all there is.
[1] Note that de Blanc’s proposal is inapplicable here, since the quantum ontology is not a Markov decision process.
[2] To be clear, this is just a vague informal description; FCR is an actual rigorous mathematical framework.