Because being exposed to ordered sensory data will rapidly promote the hypothesis that induction works

Not if the alternative hypothesis assigns about the same probability to the data up to the present. For example, an alternative hypothesis to the standard “the sun rises every day” is “the sun rises every day, until March 22, 2015”, and the alternative hypothesis assigns the same probability to the data observed until the present as the standard one does.

You also have to trust your memory and your ability to compute Solomonoff induction, both of which are demonstrably imperfect.

There’s an infinite number of alternative hypotheses like that, and you need a new one every time the previous one gets disproven. Assigning so much probability to all of them that they went on dominating Solomonoff induction on every round, even after exposure to large quantities of sensory information, would require that the remaining probability mass assigned to the prior for Solomonoff induction be less than exp(−(amount of sensory information)), that is, super-exponentially tiny.

My brain parsed “super-exponentially tiny” as “arbitrarily low” or somesuch. I did not wonder why it specifically needed to be super-exponential. Hence this post served both to point out that I should have been confused (I wouldn’t have understood why) and to dispel the confusion.

Something about that amuses me.

You could choose to single out one alternative hypothesis that says the sun won’t rise some day in the future. The ratio between P(sun rises until day X) and P(sun rises every day) will not change with any evidence before day X. If you initially assigned 99% probability to “the sun rises every day until day X” and 1% to Solomonoff induction’s prior, you would end up assigning more than 99% probability to “the sun rises every day until day X”.
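
To make the arithmetic concrete, here is a minimal sketch of that update. The three-way split of the prior (including the 10% share for “the sun rises every day” inside the 1%) and the likelihood of 1/2 for the catch-all “other” bucket are illustrative assumptions, not part of the argument above.

```python
from fractions import Fraction

def posterior(prior, likelihoods):
    """One step of Bayes' rule over a dict of hypotheses."""
    unnorm = {h: prior[h] * likelihoods[h] for h in prior}
    total = sum(unnorm.values())
    return {h: p / total for h, p in unnorm.items()}

# Assumed prior: 99% on "until day X", 1% on a Solomonoff-style mixture,
# of which one tenth goes to "every day" and the rest to "other".
prior = {
    "until_day_X": Fraction(99, 100),
    "every_day": Fraction(1, 1000),
    "other": Fraction(9, 1000),
}

# Day X is still in the future, so both sun hypotheses predict each
# observed sunrise with probability 1. "other" (hypothetical) predicts
# each sunrise with probability 1/2.
likelihood_per_sunrise = {
    "until_day_X": Fraction(1),
    "every_day": Fraction(1),
    "other": Fraction(1, 2),
}

belief = dict(prior)
for _ in range(20):  # twenty sunrises, all before day X
    belief = posterior(belief, likelihood_per_sunrise)

ratio_before = prior["until_day_X"] / prior["every_day"]
ratio_after = belief["until_day_X"] / belief["every_day"]
print(ratio_before, ratio_after, float(belief["until_day_X"]))
```

The ratio between the two sun hypotheses is untouched by the sunrises, while the mass lost by “other” is shared out in proportion, which pushes “until day X” slightly above its starting 99%.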

Solomonoff induction itself will give some significant probability mass to “induction works until day X” statements. The Kolmogorov complexity of “the sun rises until day X” is about the Kolmogorov complexity of “the sun rises every day” plus the Kolmogorov complexity of X (approximately log2(X) + 2 log2(log2(X))). Therefore, even according to Solomonoff induction, the “sun rises until day X” hypothesis will have a probability approximately proportional to P(sun rises every day) / (X log2(X)^2). This decreases subexponentially with X, and even slower if you sum this probability for all Y >= X.
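
As a rough numerical check of that estimate, the sketch below evaluates the relative weight 2^(-K(X)) under the approximation K(X) ≈ log2(X) + 2 log2(log2(X)) and compares it with an exponentially decaying weight. The function names are made up for illustration, and constant factors are ignored.

```python
import math

def relative_weight(x):
    """Approximate weight of 'sun rises until day x' relative to
    'sun rises every day': 2**-K(x) = 1 / (x * log2(x)**2)."""
    return 1.0 / (x * math.log2(x) ** 2)

def exponential_weight(x):
    """What an exponentially decaying weight would look like instead."""
    return 2.0 ** -x

for x in (10, 100, 1000):
    print(x, relative_weight(x), exponential_weight(x))
```

Doubling X divides the weight by only a little more than 2, whereas an exponential penalty would square it away almost immediately.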

In order to get exponential change in the odds, you would need to have repeatable independent observations that distinguish between Solomonoff induction and some other hypothesis. You can’t get that in the case of “sun rises every day until day X” hypotheses.

If you only assign significant probability mass to one changeover day, you behave inductively on almost all the days up to that point, and hence make relatively few epistemic errors. To put it another way, unless you assign superexponentially-tiny probability to induction ever working, the number of anti-inductive errors you make over your lifespan will be bounded.

If you only assign significant probability mass to one changeover day, you behave inductively on almost all the days up to that point, and hence make relatively few epistemic errors.

But even one epistemic error is enough to cause an arbitrarily large loss in utility. Suppose you think that with 99% probability, unless you personally join a monastery and stop having any contact with the outside world, God will put everyone who ever existed into hell on 1/1/2050. So you do that instead of working on making a positive Singularity happen. Since you can’t update away this belief until it’s too late, it does seem important to have “reasonable” priors instead of just a non-superexponentially-tiny probability to “induction works”.

But even one epistemic error is enough to cause an arbitrarily large loss in utility.

This is always true.

Since you can’t update away this belief until it’s too late, it does seem important to have “reasonable” priors instead of just a non-superexponentially-tiny probability to “induction works”.

I’d say more that besides your one reasonable prior you also need to not make various sorts of specifically harmful mistakes, but this only becomes true when instrumental welfare as well as epistemic welfare are being taken into account. :)

Do you think it’s useful to consider “epistemic welfare” independently of “instrumental welfare”? To me it seems that approach has led to a number of problems in the past.

Solomonoff Induction was historically justified in a way similar to your post: you should use the universal prior, because whatever the “right” prior is, if it’s computable then substituting the universal prior will cost you only a limited number of epistemic errors. I think this sort of argument is more impressive/persuasive than it should be (at least for some people, including myself when I first came across it), and makes them erroneously think the problem of finding “the right prior” or “a reasonable prior” is already solved or doesn’t need to be solved.

Thinking that anthropic reasoning / indexical uncertainty is clearly an epistemic problem and hence ought to be solved within epistemology (rather than decision theory), leading for example to dozens of papers arguing over what is the right way to do Bayesian updating in the Sleeping Beauty problem.

Ok, I agree with this interpretation of “being exposed to ordered sensory data will rapidly promote the hypothesis that induction works”.

Yep! And for the record, I agree with your above paragraphs given that.

I would like to note explicitly for other readers that probability goes down proportionally to the exponential of Kolmogorov complexity, not proportionally to Kolmogorov complexity. So the probability of the Sun failing to rise the next day really is going down at a noticeable rate, as jacobt calculates (1 / (x log(x)^2) on day x). You can’t repeatedly have large likelihood ratios against a hypothesis or mixture of hypotheses and not have it be demoted exponentially fast.
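
A small sketch of that rate, under loudly stated assumptions: “the sun rises every day” gets relative weight 1, each “fails on day y” hypothesis gets weight 1 / (y log2(y)^2), and the family is truncated at a hypothetical horizon so the sum is finite. After sunrises on every day before `day`, only “every day” and the hypotheses with y >= day survive.

```python
import math

def w(y):
    """Assumed relative prior weight of 'the sun first fails on day y'."""
    return 1.0 / (y * math.log2(y) ** 2)

def failure_probability(day, horizon=100_000):
    """Posterior probability that the sun fails on `day`, given sunrises
    on all earlier days. 'Every day' has relative weight 1; the horizon
    is an arbitrary truncation for illustration."""
    surviving = 1.0 + sum(w(y) for y in range(day, horizon + 1))
    return w(day) / surviving

for day in (10, 100, 1000):
    print(day, failure_probability(day))
```

The printed values shrink roughly like 1 / (x log2(x)^2), which is noticeable but far from exponential.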

But… no.

“The sun rises every day” is much simpler information and computation than “the sun rises every day until Day X”. To put it in caricature, if hypothesis “the sun rises every day” is:

XXX1XXXXXXXXXXXXXXXXXXXXXXXXXX

(reading from the left)

then the hypothesis “the sun rises every day until Day X” is:

XXX0XXXXXXXXXXXXXXXXXXXXXX1XXX

And I have no idea if that’s even remotely the right order of magnitude, simply because I have no idea how many possible-days or counterfactual days we need to count, nor of how exactly the math should work out.

The important part is that for every possible Day X, it is equally balanced by the “the sun rises every day” hypothesis, and AFAICT this is one of those things implied by the axioms. So because of complexity giving you base rates, most of the evidence given by sunrise accrues to “the sun rises every day”, and the rest gets evenly divided over all non-falsified “Day X” (also, induction by this point should let you induce that Day X hypotheses will continue to be falsified).

In fact, the sun will not rise every day. It’s not clear if the physics where things can happen forever is simpler than physics where things cannot.

Point taken. I was oversimplifying it in my mind.

My (revised) claim is that the hypothesis where the sun rises every day until explosion / star death / heat death / planetary destruction / other common cataclysmic event of the types we usually expect to end the rising of the sun is a simpler one than any hypothesis where we observe the sun not rising before observing such a cataclysmic event or any evidence thereof (e.g. the Earth just happened to stop rotating during that 24-hour period, maybe because someone messed up their warp engine experiment or something)¹.

The “the odds are evenly divided between the two” part of the grandparent does need revision in light of this, though.

The fun part of this one is that it doesn’t mean the sun stops rising forever, either. And if the Earth also stopped revolving around the sun… well, then we have one hell of a problem. Not that yearly daylight cycles aren’t a problem for a crazy load of other reasons, that is. Diving face-first into a star just sounds a bit more unpleasant and existentially-dooming.

Are you being deliberately obtuse? When Laplace asked “what is the chance the Sun will rise tomorrow” he was obviously describing a 24-hour period.

The point is that concise hypotheses are trickier than they seem.

You’re making the argument that Solomonoff induction would select “the sun rises every day” over “the sun rises every day until day X”. I agree, assuming a reasonable prior over programs for Solomonoff induction. However, if your prior is 99% “the sun rises every day until day X”, and 1% “Solomonoff induction’s prior” (which itself might assign, say, 10% probability to the sun rising every day), then you will end up believing that the sun rises every day until day X. Eliezer asserted that in a situation where you assign only a small probability to Solomonoff induction, it will quickly dominate the posterior. This is false.

most of the evidence given by sunrise accrues to “the sun rises every day”, and the rest gets evenly divided over all non-falsified “Day X”

Not sure exactly what this means, but the ratio between the probabilities “the sun rises every day” and “the sun rises every day until day X” will not be affected by any evidence that happens before day X.
