Dissolving the Problem of Induction

The Problem of Induction, first posed by David Hume in 1739, is a longstanding potential crack in the foundations of science. From LessWrong’s Induction tag page:

Modern views of induction state that any form of reasoning where the conclusion isn’t necessarily entailed in the premises is a form of inductive reasoning [...] “The sun has always risen, so it will also rise tomorrow”. [...] Contrary to deduction, induction can be wrong since the conclusions depend on the way the world actually is, not merely on the logical structure of the argument.

There has historically been a problem with the justification of the validity of induction. Hume argued that the justification for induction could either be a deduction or an induction. Since deductive reasoning only results in necessary conclusions and inductions can fail, the justification for inductive reasoning could not be deductive. But any inductive justification would be circular.

When I was first taught the Problem of Induction as a college student, I felt the same way as Bertrand Russell:

[Russell] expressed the view that if Hume’s problem cannot be solved, “there is no intellectual difference between sanity and insanity”.

I felt unsure—maybe not on a gut level, but on an intellectual level—about the solidity of the literal ground under my feet. Sure, the ground has been solid every day of my life, but why should it continue to be solid for even another second? What’s the logic? What’s the justification?

I didn’t have a satisfactory answer at the time, so I did what a typical non-philosopher civilian does: I stepped over the crack in the foundation of science, took a leap of faith that the whole sane world wouldn’t crumble as Russell warned, and went on with my life.

Then I read Eliezer’s 2008 post on the topic, titled “Where Recursive Justification Hits Bottom”:

Why do I believe that the Sun will rise tomorrow?

Because I’ve seen the Sun rise on thousands of previous days.

Ah… but why do I believe the future will be like the past?

Even if I go past the mere surface observation of the Sun rising, to the apparently universal and exceptionless laws of gravitation and nuclear physics, then I am still left with the question: “Why do I believe this will also be true tomorrow?”

I could appeal to Occam’s Razor, the principle of using the simplest theory that fits the facts… but why believe in Occam’s Razor? Because it’s been successful on past problems? But who says that this means Occam’s Razor will work tomorrow?

Eliezer explains that he’s comfortable using inductive reasoning to justify inductive reasoning because it’s a reflectively-stable equilibrium:

I start going around in a loop at the point where I explain, “I predict the future as though it will resemble the past on the simplest and most stable level of organization I can identify, because previously, this rule has usually worked to generate good results; and using the simple assumption of a simple universe, I can see why it generates good results; and I can even see how my brain might have evolved to be able to observe the universe with some degree of accuracy, if my observations are correct.”

He elaborates in a followup post that self-justifying reasoning is the best you can do when you’re reasoning about your own reasoning:

I don’t think that going around in a loop of justifications through the meta-level is the same thing as circular logic. I think the notion of “circular logic” applies within the object level, and is something that is definitely bad and forbidden, on the object level. Forbidding reflective coherence doesn’t sound like a good idea. But I haven’t yet sat down and formalized the exact difference—my reflective theory is something I’m trying to work out, not something I have in hand.

I understand Eliezer to be saying: Sure, the Problem of Induction is a problem we have to solve, a potential crack in the foundation of science that we have to do something about, but we can solve it with a totally fine one-off hack: we just paper over the crack by granting ourselves an axiom that we live in a universe where induction works. Once we paper over the crack with that axiom, we don’t have to fear that the crack will open up and swallow us up, because when we (humans and superintelligent AIs) introspect on the crack from within our papered-over world, it’s not a crack anymore; it has every appearance of a solid foundation.

I had figured this was probably the end of the story with the Problem of Induction, and I found it satisfying enough… until today, when I read David Deutsch’s 1997 book, The Fabric of Reality.

Deutsch doesn’t exactly attempt to solve the Problem of Induction; he instead says there is no Problem of Induction.

How Scientific Theories Actually Work

First, Deutsch points out that “induction” is a terrible characterization of how actual scientific theories get generated and accepted:

It is hard to know where to begin in criticizing the inductivist conception of science — it is so profoundly false in so many different ways. Perhaps the worst flaw [of the inductivist conception of science], from my point of view, is the sheer non sequitur that a generalized prediction is tantamount to a new theory. [P. 59]

For example, it doesn’t do justice to Einstein’s two theories of relativity to say that his work consisted of extrapolating a bunch of similar observations. It’s more accurate to describe him as struggling to navigate the vastness of concept-space in order to locate a mathematical structure that fit various constraints, such as keeping the speed of light constant in every reference frame.

Furthermore, in many cases it only became possible to make a “generalization” or “extrapolation”, or claim that future observations would be “similar” to past ones, once the equations of relativity were available to operationalize those terms in a theory-specific way.

Eliezer himself says that it’s possible to hypothesize General Relativity not from a bunch of repetitive observations, but from a single observation examined by a sufficiently capable scientist:

A Bayesian superintelligence, hooked up to a webcam, would invent General Relativity as a hypothesis—perhaps not the dominant hypothesis, compared to Newtonian mechanics, but still a hypothesis under direct consideration—by the time it had seen the third frame of a falling apple. It might guess it from the first frame, if it saw the statics of a bent blade of grass.

Deutsch’s insight about the Problem of Induction, which he credits to Karl Popper, is that it makes a type error: the type of a Hypothesis is not a hand-wavy Generalized Prediction, it’s a detailed computational Model. And what a scientist’s Observation→Hypothesis function does isn’t generalization, it’s reverse engineering.

Our best scientific theories are the ones that yield the highest compression factor on our observations and predictions. They didn’t win a contest for “best application of induction” against other candidate theories; they won out either because the other theories were ruled out by observation, or because the other theories didn’t yield as high a compression factor.
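To make the compression framing concrete, here is a minimal Python sketch. It is my own toy illustration, not something from Deutsch’s book: the function names, the 4-number model format, and the falling-object data (positions 5t², in made-up units) are all assumptions chosen for the example. The point is that a “theory” of constant acceleration replaces 1,000 recorded positions with 4 numbers, and it does so by reverse-engineering the rule that generates the data rather than by noting that the observations look alike.

```python
from typing import List, Optional, Tuple

# Hypothetical model format for this sketch:
# (start position, first step, constant acceleration, number of observations)
Model = Tuple[int, int, int, int]


def fit_constant_acceleration(obs: List[int]) -> Optional[Model]:
    """Return a 4-number model if the second differences are constant, else None."""
    if len(obs) < 3:
        return None
    first = [b - a for a, b in zip(obs, obs[1:])]
    second = [b - a for a, b in zip(first, first[1:])]
    if len(set(second)) != 1:
        return None  # the data are not explained by this theory
    return (obs[0], first[0], second[0], len(obs))


def predict(model: Model) -> List[int]:
    """Regenerate every observation from the compact model."""
    start, step, accel, count = model
    values = []
    value = start
    for _ in range(count):
        values.append(value)
        value += step
        step += accel
    return values


# Toy observations: positions of a falling object, 5 * t^2 (assumed units).
observations = [5 * t * t for t in range(1000)]

model = fit_constant_acceleration(observations)  # (0, 5, 10, 1000)
assert model is not None and predict(model) == observations

# 1000 stored numbers versus 4 model parameters.
compression_factor = len(observations) / len(model)  # 250.0
```

A sequence with no such structure, for example [3, 1, 4, 1, 5, 9, 2, 6], makes fit_constant_acceleration return None: this particular theory is ruled out by observation, and nothing is compressed.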

It’s Not About Past vs. Future

Consider Eliezer’s words again (emphasis mine):

I start going around in a loop at the point where I explain, “I predict the future as though it will resemble the past on the simplest and most stable level of organization I can identify, because previously, this rule has usually worked to generate good results; and using the simple assumption of a simple universe, I can see why it generates good results; and I can even see how my brain might have evolved to be able to observe the universe with some degree of accuracy, if my observations are correct.”

In the emphasized part (“on the simplest and most stable level of organization I can identify”), Eliezer is implicitly retreating from the obviously false claim that surface-level features of the universe always stay consistent to the more defensible claim that the universe has a “simple and stable level of organization” where things always stay consistent.

But how defensible is it that “the future will resemble the past” on any level? Do we really want to try to salvage any form of this claim?

Deutsch points out that the only sense in which the future actually resembles the past, generally speaking, is a trivial one: a theory explains how multiple pieces of the universe (multiple regions of time, multiple regions of space, two forces, two trajectories, etc.) are related in some way. We can trivially claim that a theory of gravity expresses a fundamental “similarity” between the motions of a bullet and a satellite, even though one falls and one orbits.

The key underlying assumption of Eliezer’s statement is: “I am able to identify simpler and more stable levels of organization in the universe than what meets the eye”. But if we grant that those levels of organization exist, we’re granting that the universe is compressible: that it has properties shared by different areas of space, properties shared by different types of matter, properties shared by different types of forces, properties shared by different trajectories of motion, etc.

Having granted all that, we don’t need to grant an additional license to “predict the future as though it will resemble the past”. Believing that a property is shared by different times of the universe—namely, the past and future—doesn’t require a special-case justification.

In Deutsch’s words:

The best existing theories, which cannot be abandoned lightly because they are the solutions of problems, contain predictions about the future. And these predictions cannot be severed from the theories’ other content[...] because that would spoil the theories’ explanatory power. Any new theory we propose must therefore either be consistent with these existing theories, which has implications for what the new theory can say about the future, or contradict some existing theories but address the problems thereby raised, giving alternative explanations, which again constrains what they can say about the future. [P. 162]

Solving vs. Dissolving*

Here’s what I see as solving vs. dissolving the Problem of Induction:

Eliezer attempted to solve the Problem of Induction by licensing a “reflective loop through the meta level”.

Deutsch dissolved the Problem of Induction by pointing out that induction doesn’t actually play a role in science. Science is a reverse-engineering exercise that doesn’t rely on the assumption that “the future will be similar to the past”.

When we understand the business of science as reverse-engineering a compressed model of the universe, I don’t think its justification relies on a “loop through the meta level”, although it admittedly does rely on Occam’s Razor.

The Problem of Occam’s Razor

I believe Eliezer and others have to some degree conflated the original Problem of Induction with the easier problem of justifying an Occamian Prior. Eliezer writes:

There are possible minds in mind design space who have anti-Occamian and anti-Laplacian priors; they believe that simpler theories are less likely to be correct, and that the more often something happens, the less likely it is to happen again.

And when you ask these strange beings why they keep using priors that never seem to work in real life… they reply, “Because it’s never worked for us before!”

Now, one lesson you might derive from this, is “Don’t be born with a stupid prior.” This is an amazingly helpful principle on many real-world problems, but I doubt it will satisfy philosophers.
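To see how badly an anti-Laplacian prior performs, here is a small Python sketch of my own. The Laplacian rule is the standard rule of succession, (s + 1) / (n + 2) after s successes in n trials. The anti_laplace rule is only my assumed stand-in for the strange beings’ prior (it simply flips the Laplacian probability); Eliezer’s post doesn’t specify a formula. Both rules are scored by total log loss, in nats, on a toy record of 100 consecutive sunrises.

```python
import math
from fractions import Fraction
from typing import Callable, Sequence

Rule = Callable[[int, int], Fraction]


def laplace(successes: int, trials: int) -> Fraction:
    """Rule of succession: probability that the next trial is a success."""
    return Fraction(successes + 1, trials + 2)


def anti_laplace(successes: int, trials: int) -> Fraction:
    """Assumed anti-Laplacian rule: the more it has happened, the less expected."""
    return 1 - laplace(successes, trials)


def total_log_loss(rule: Rule, outcomes: Sequence[int]) -> float:
    """Sum of -ln(probability the rule assigned to what actually happened)."""
    loss = 0.0
    successes = 0
    for trials, outcome in enumerate(outcomes):
        p = float(rule(successes, trials))
        loss += -math.log(p if outcome else 1 - p)
        successes += outcome
    return loss


sunrises = [1] * 100  # toy data: the Sun rose on each of 100 days

laplace_loss = total_log_loss(laplace, sunrises)    # equals ln(101)
anti_loss = total_log_loss(anti_laplace, sunrises)  # equals ln(101!)
```

The Laplacian reasoner’s loss grows only logarithmically with the number of sunrises, while the anti-Laplacian reasoner pays more on every successive morning. Of course, this comparison is itself scored on past data, which is exactly why the strange beings remain unmoved.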

In my view, it’s a significant and under-appreciated milestone that we’ve reduced the original Problem of Induction to the problem of justifying Occam’s Razor. We’ve managed to drop two confusing aspects from the original Problem of Induction:

  1. We don’t have to justify using “similarity”, “resemblance”, or “collecting a bunch of confirming observations”, because we know those things aren’t key to how science actually works.

  2. We don’t have to justify “the future resembling the past” per se. We only have to justify that the universe allows intelligent agents to learn probabilistic models that are better than maximum-entropy belief states.

Qiaochu Yuan thinks the Problem of Occam’s Razor might even be solvable with a simple counting argument:

Roughly speaking, weak forms of Occam’s razor are inevitable because there just aren’t as many “simple” hypotheses as “complicated” ones, whatever “simple” and “complicated” mean, so “complicated” hypotheses just can’t have that much probability mass individually. (And in turn the asymmetry between simple and complicated is that simplicity is bounded but complexity isn’t.)
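The counting argument can be sketched in a few lines of Python. This is my own illustration under a stated assumption: hypotheses are encoded as binary strings and “complexity” is string length, which the quote above deliberately leaves open. The function names are hypothetical. The sketch shows two facts: the total prior mass of 1 has to be shared among 2^n hypotheses at length n, and no prior of any kind can give more than 1/ε hypotheses a mass of at least ε.

```python
from fractions import Fraction


def count_hypotheses(length: int) -> int:
    """Number of distinct binary strings of exactly this length."""
    return 2 ** length


def count_hypotheses_up_to(max_length: int) -> int:
    """Number of binary strings with length 1 through max_length."""
    return sum(count_hypotheses(n) for n in range(1, max_length + 1))


def average_mass_bound(length: int) -> Fraction:
    """Upper bound on the mean prior mass per hypothesis of this length.

    Even if the entire probability mass of 1 went to this one length,
    it would be split among 2**length hypotheses.
    """
    return Fraction(1, count_hypotheses(length))


def max_count_with_mass(threshold: Fraction) -> int:
    """At most floor(1 / threshold) hypotheses can each have mass >= threshold."""
    return int(1 / threshold)


simple_count = count_hypotheses_up_to(10)             # 2046 "simple" hypotheses
typical_complex_mass = average_mass_bound(20)         # at most 1/1048576 each
heavy_limit = max_count_with_mass(Fraction(1, 100))   # at most 100 hypotheses
```

Note what this does and doesn’t show. A prior could still pour all its mass into one long hypothesis, so the bound is on the average complicated hypothesis, not on every one of them. That is why the quote speaks of weak forms of Occam’s razor.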

I feel like the Problem of Occam’s Razor is analogous to the problem of “placing math on a more secure foundation so that we really know for sure that 1+1=2”. Sure, it’s worth doing, but it’s not such a big crisis that we really have to wonder whether 1+1=2.

I feel much calmer and more confident about a quest to figure out how to place Occamian priors on a secure foundation than I previously felt about a misguided quest to figure out why our universe should fundamentally owe us a future that’s similar to our past.

*Catchier than “Explaining vs. Explaining Away”, amirite?