# A note on the description complexity of physical theories

Eliezer wrote:

In physics, you can get absolutely clear-cut issues. Not in the sense that the issues are trivial to explain. [...] But when I say “macroscopic decoherence is simpler than collapse” it is actually strict simplicity; you could write the two hypotheses out as computer programs and count the lines of code.

Every once in a while I come across some belief in my mind that clearly originated from someone smart, like Eliezer, and stayed unexamined because after you hear and check 100 correct statements from someone, you’re not about to check the 101st quite as thoroughly. The above quote is one of those beliefs. In this post I’ll try to look at it closer and see what it really means.

Imagine you have a physical theory, expressed as a computer program that generates predictions. A natural way to define the Kolmogorov complexity of that theory is to find the length of the shortest computer program that generates your program, as a string of bits. Under this very natural definition, the many-worlds interpretation of quantum mechanics is almost certainly simpler than the Copenhagen interpretation.

But imagine you refactor your prediction-generating program and make it shorter; does this mean the physical theory has become simpler? Note that after some innocuous refactorings of a program expressing some physical theory in a recognizable form, you may end up with a program that expresses a different set of physical concepts. For example, if you take a program that calculates classical mechanics in the Lagrangian formalism, and apply multiple behavior-preserving changes, you may end up with a program whose internal structures look distinctly Hamiltonian.

Therein lies the rub. Do we really want a definition of “complexity of physical theories” that tells apart theories making the same predictions? If our formalism says Hamiltonian mechanics has a higher prior probability than Lagrangian mechanics, which is demonstrably mathematically equivalent to it, something’s gone horribly wrong somewhere. And do we even want to define “complexity” for physical theories that don’t make any predictions at all, like “glarble flargle” or “there’s a cake just outside the universe”?

At this point, the required fix to our original definition should be obvious: cut out the middleman! Instead of finding the shortest algorithm that writes your algorithm for you, find the shortest algorithm that outputs the same predictions. This new definition has many desirable properties: it’s invariant to refactorings, doesn’t discriminate between equivalent formulations of classical mechanics, and refuses to specify a prior for something you can never ever test by observation. Clearly we’re on the right track here, and the original definition was just an easily fixable mistake.
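The two definitions can be contrasted with a toy sketch. Everything below is illustrative and not from the post: two differently organized programs for the same harmonic oscillator, one phrased as Hamilton’s equations and one as a direct equation of motion, emit identical prediction strings, so the amended definition sees a single hypothesis where the program-counting definition saw two distinct programs.

```python
def predictions_newtonian(steps=100, dt=0.01):
    # Integrate q'' = -q directly (Euler-Cromer): update velocity, then position.
    q, v = 1.0, 0.0
    out = []
    for _ in range(steps):
        v += -q * dt
        q += v * dt
        out.append(round(q, 9))
    return out

def predictions_hamiltonian(steps=100, dt=0.01):
    # Same dynamics written via Hamilton's equations: dp/dt = -q, dq/dt = p.
    q, p = 1.0, 0.0
    out = []
    for _ in range(steps):
        p -= q * dt
        q += p * dt
        out.append(round(q, 9))
    return out

# Different "formalisms", identical prediction streams: by the amended
# definition, one hypothesis with one complexity.
assert predictions_newtonian() == predictions_hamiltonian()
```

Since the two functions produce bit-identical output, any complexity measure defined over the prediction stream alone cannot tell them apart.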

But this easily fixable mistake… was the entire reason for Eliezer “choosing Bayes over Science” and urging us to do the same. The many-worlds interpretation makes the same testable predictions as the Copenhagen interpretation right now. Therefore by the amended definition of “complexity”, by the right and proper definition, they are equally complex. The truth of the matter is not that they express different hypotheses with equal prior probability—it’s that they express the same hypothesis. I’ll be the first to agree that there are very good reasons to prefer the MWI formulation, like its pedagogical simplicity and beauty, but K-complexity is not one of them. And there may even be good reasons to pledge your allegiance to Bayes over the scientific method, but this is not one of them either.

ETA: now I see that, while the post is kinda technically correct, it’s horribly confused on some levels. See the comments by Daniel_Burfoot and JGWeissman. I’ll write an explanation in the discussion area.

ETA 2: done, look here.

• MWI and Copenhagen do not make the same predictions in all cases, just in testable ones. There is a simple program that makes the same predictions as MWI in all cases. There appears to be no comparably simple program that makes the same predictions as Copenhagen in all cases. So, if you gave me some complicated test which could not be carried out today, but on which the predictions of MWI and Copenhagen differed, and asked me to make a prediction about what would happen if the experiment was somehow run (it seems likely that such experiments will be possible at some point in the extremely distant future), I would predict that MWI will be correct with overwhelming probability. I agree that if some other “more complicated” theory made the same predictions as MWI in every case, then K-complexity would not give good grounds to decide between them.

I guess the fundamental disagreement is that you think MWI and Copenhagen are the same theory because discriminating between them is right now far out of reach. But I think the existence of any situation where they make different predictions is precisely sufficient to consider them different theories. I don’t know why “testable” (meaning testable in practice, not in theory) was thrown in at the last minute, because it does not seem to appear anywhere in the rest of the post.

If instead you are asserting that MWI and Copenhagen make the same theoretically testable predictions, then I disagree as a matter of fact. MWI asserts that interference should be able to occur on arbitrary scales, in particular on the scale of an entire planet or galaxy (even though such interference is spectacularly difficult to engineer and/or will have a very small effect on probability amplitudes), while Copenhagen seems to imply that it cannot occur on any scale larger than a human observer.

I wouldn’t be surprised if I am wrong on that question of fact, and it would certainly be good for me to fix my error now if I am.

• Copenhagen seems to imply that it [interference] cannot occur on any scale larger than a human observer.

I’m always skeptical when the opponent of an idea informs me of something ridiculous that the idea “seems to imply”. If it were a defender of Copenhagen drawing that implication, I would be more likely to join you in hooting with disdain.

I am far from an expert on fundamental physics, but I seem to recall someone once pooh-poohing the notion that QM and Copenhagen are in any sense tied to human observers. After all, said this author, we use QM to explain the big bang and there were no observers there. Sorry, I don’t remember who it was that made that point, so I can’t give you a quote.

Does anyone else remember this better than me?

• I say “seems to imply” because it’s not really clear what Copenhagen does or does not imply, because it doesn’t really make predictions in every circumstance.

Copenhagen implies that when we make a measurement, the value we measure is in fact the one true value. In particular, there are not other worlds where the measurement returned other values. This is precisely the thing that distinguishes it from many worlds, which suggests that there are other worlds where the measurement returned other values.

By accepting that only one value of the measurement actually happens, you reject the possibility of one human civilization, where one thing was measured, interfering with a different human civilization, where a different thing was measured (because you don’t believe the other human civilization even exists).

• That is not a prediction, it is a postdiction, and one that Copenhagen and MWI agree on: when you make a measurement, the result you get is the only result you get in the world that you are in.

• MWI implies that you could interact with other worlds where there was a different outcome, at least in principle. This is not a postdiction because we have never engineered such an intricate situation, and MWI and Copenhagen don’t agree on it. In fact this was proposed as a test of MWI vs. Copenhagen: build a quantum computer, upload a human into it, and then run the experiment I described above (to my knowledge, this is actually the first proposed use of a quantum computer).

Actually this isn’t really an “in principle” thing. If we ignore gravity (I suspect that if we understood gravity correctly we wouldn’t have to) and assume the universe has finitely many degrees of freedom, MWI predicts that all of these world lines will eventually converge in the future. It will just be a long time in the future, after all order in the universe has eroded and entropy is decreasing again. This is clearly not what Copenhagen says. In Copenhagen, it is possible that we never again return to the state of the universe at the big bang. In MWI this is not possible, because all systems with finitely many degrees of freedom are periodic.

• Copenhagen seems to imply that it [interference] cannot occur on any scale larger than a human observer.

[...]

I am far from an expert on fundamental physics, but I seem to recall someone once pooh-poohing the notion that QM and Copenhagen are in any sense tied to human observers

Copenhagen implies that under some circumstances, interference stops. That’s all that can be meant by “collapse”. Maybe above some length scale; maybe above some critical mass; maybe above some number of interacting particles—it’s fuzzy on the details. And of course, if that scale happens to be larger than, oh, say, a person, then you are branching and then having your branches destroyed all the time.

So yes, if everything is allowed to interfere as naively implied by the Schrödinger equation, you’re not talking about Copenhagen, you’re talking about MWI.

• Copenhagen implies that under some circumstances, a non-deterministic and non-continuous process happens. (I find the phrase “interference stops” misleading.) The circumstances aren’t defined by some scale but, as Manfred says in the other reply, by what “observer” means. There is a completely analogous question in MWI, where, under some circumstances, branching occurs.

Another thing is, if Copenhagen = objective collapse in the LW parlance, then MWI isn’t the only alternative to Copenhagen.

• What is the “analogous question” in MWI? I can make predictions in any situation using MWI without answering any such questions, whereas in Copenhagen I make different predictions depending on what notion of “observation” I use. This is one major reason I prefer MWI.

• The analogous question is “when does branching occur?” Your predictions in MWI depend on what notion of “observation” you use; it is only less apparent in the formulation. To obtain some meaningful prediction, you at least have to specify what the observables are and what subsystem of the world corresponds to the observer and his mind states.

But I am not completely sure what you are speaking about. Maybe you can give a concrete example where MWI gives a unique answer while the collapse formulation doesn’t?

• Branching is not a physical phenomenon in MWI; it is a way humans talk about normal unitary evolution on large scales. It is not involved in making predictions, just in talking about them.

The typical example distinguishing MWI and Copenhagen is the following. Suppose I build a quantum computer which simulates a human. I then perform the following experiment. I send an electron through a slit, and have the quantum computer measure which slit it went through (that is, I tell the human being simulated which slit it went through). I then stop the electron, and let the simulated human contemplate his observation for a while. Afterwards, it is still possible for me to “uncompute” the simulated human’s memory (just as it is in principle possible to uncompute a real human’s state) and make him forget which slit the electron went through. The electron then proceeds to the screen and hits it. Is the electron distributed according to an interference pattern, or not?

If you think the answer to that question is obvious in Copenhagen, because the simulated human is obviously not an observer, then suppose instead that I replace the simulated human with a real human, maintained in such a carefully controlled environment that I can uncompute his observation (technically unrealistic, but theoretically perfectly possible).

If the answer to that question is also obvious, suppose I replace the real human with the entire planet earth.

MWI predicts an interference pattern in all of these cases. However, Copenhagen’s prediction seems to depend on exactly which of those experiments have “observation” in them. Can a quantum computer observe? Can a single isolated human observe? Can an isolated planet observe? Does collapse occur precisely when “there is no possible way to uncompute the result”? The last would give the same predictions as MWI by design, but is really an astoundingly complex axiom and probably is never satisfied.
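The uncompute experiment can be sketched as a two-qubit toy model. This is an illustrative simulation, not anything from the thread: one qubit is the electron’s which-slit degree of freedom, a second qubit stands in for the (simulated) observer’s memory, and fringe visibility is read off the off-diagonal element of the electron’s reduced density matrix.

```python
import numpy as np

# Basis order |electron, memory>: |00>, |01>, |10>, |11>.
start = np.array([1, 0, 0, 0], dtype=complex)  # |L>|no record>
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
I2 = np.eye(2, dtype=complex)
# CNOT with the electron as control: flips the memory iff electron is |R>.
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)

def coherence(state):
    # Trace out the memory; |off-diagonal| of the electron's reduced
    # density matrix measures how visible the interference fringes are.
    psi = state.reshape(2, 2)          # indices: (electron, memory)
    rho = psi @ psi.conj().T
    return abs(rho[0, 1])

split = np.kron(H, I2) @ start   # electron in superposition of both slits
recorded = CNOT @ split          # memory "observes" which slit
uncomputed = CNOT @ recorded     # record is reversibly erased

print(coherence(split))        # ~0.5: interference
print(coherence(recorded))     # 0: no interference while a record exists
print(coherence(uncomputed))   # ~0.5: interference restored
```

In this toy model both pictures agree on the math; the contested question in the thread is only whether the second qubit counts as an “observer” whose record triggers an irreversible collapse.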

• It’s not about scale, it’s about the theory of measurement and what “observer” means. If electron 1 bounces off electron 2 in Copenhagen, electron 1 sees electron 2 as “collapsed” into one eigenstate. If electrons bounce in MWI, they see the entire spectrum of each other. However, this is really just a change in what we’re looking at—whether we want to “observe” all the information about the electron (MWI), or just one instance of that information (Copenhagen). The reason to look at only one instance is because this is what corresponds to what people see—it’s what quantum physics looks like from inside.

I’m not aware of any examples of interference that are not explainable by the ordinary interpretation that uses collapse. I think it’s likely that some people are interchanging the different ideas of observer and forgetting that to describe entangled states in Copenhagen you need to do more work than that. Once a state is entangled you can’t describe a single (Copenhagen) observer by a pure state, which is probably what you’re thinking of when you think of “destroying the other branches.”

• What you are describing is to first compute the answer using Many Worlds, and then figure out where to apply the collapse in Copenhagen so as not to affect anything.

• No.

What I am saying is to compute the answer using quantum mechanics.

The way to do it correctly, Copenhagen style, is to say “okay, the electron goes through the plate with two holes in it. But since, from the perspective of the electron, it can’t go through two holes, the state of the electron on the other side should be entangled, something like |10> + |01>. If we fast-forward to the screen, we get an interference pattern.”

The way to do it correctly, MW style, is to say “okay, the electron has equal probability of going through each hole, so let’s split into two worlds with equal phase. The detector will then observe the superposition of the two worlds, something like |10> + |01>, except fast-forwarded to the screen, so there should be an interference pattern.”

If these two approaches look similar, there’s a reason. And it’s not that one is cribbing off the other! As you can see, introducing entanglement in the Copenhagen interpretation was definitely not arbitrary, but it is conceptually trickier than thinking through the same process using the MWI.
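The shared calculation both descriptions arrive at can be sketched numerically. This is a toy far-field model with made-up numbers: adding the two path amplitudes before squaring gives fringes, while squaring each path separately (a premature collapse at the slits) washes them out.

```python
import numpy as np

# Screen positions and a hypothetical phase difference between the two paths.
x = np.linspace(-1, 1, 201)
phase = 10 * x                      # toy path-length phase difference
a1 = np.exp(1j * phase / 2)         # amplitude via hole 1
a2 = np.exp(-1j * phase / 2)        # amplitude via hole 2

# Keep the superposition |10> + |01>: add amplitudes, then square.
p_interference = np.abs(a1 + a2) ** 2 / 2    # = 1 + cos(phase), fringes

# Collapse at the slits: square each path, then add.
p_collapsed = (np.abs(a1) ** 2 + np.abs(a2) ** 2) / 2   # = 1, flat
```

Both the Copenhagen story with proper entanglement and the MW story compute `p_interference`; only an improper early collapse yields the flat `p_collapsed`.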

• Does your understanding of Copenhagen Quantum Mechanics reject the conclusion of Many Worlds, that the universe is in a superposition of many states, many of which can contain people, which can’t observe each other?

If not, I think this has become an argument about definitions.

• I’m actually pretty sure the Copenhagen Interpretation isn’t complete/coherent enough to actually be turned into a computer program. It just waves its hands around the Measurement Problem. The Occam’s Razor justification people want to make for Many Worlds needs to be made in comparison to the interpretation’s viable competitors, like de Broglie-Bohm and company.

• I’m actually pretty sure the Copenhagen Interpretation isn’t complete/coherent enough to actually be turned into a computer program. It just waves its hands around the Measurement Problem.

I can’t argue with that.

The Occam’s Razor justification people want to make for Many Worlds needs to be made in comparison to the interpretation’s viable competitors, like de Broglie-Bohm and company.

Bohm’s theory is one of those hidden variable theories, which according to EPR must have something like faster-than-light signaling?

• Bohm’s theory is one of those hidden variable theories, which according to EPR must have something like faster-than-light signaling?

It is a hidden variable theory and as such it is non-local, but the non-locality doesn’t imply that we can use it for FTL signalling. The main problems are (a) you need to do weird things to it to make it Lorentz invariant and (b) it is less parsimonious than Many Worlds (as all hidden variable theories probably will be, since they add more variables!). On the other hand it returns the Born probabilities (ED: which I guess I would argue makes it parsimonious in a different way, since it doesn’t have this added postulate).

I don’t really know enough to make the evaluation for myself. But my sense is that we as a community have done way too much talking about why MWI is better than CI (it obviously is) and not nearly enough thinking about the other alternatives.

• Does your understanding of Copenhagen Quantum Mechanics reject the conclusion [...] that the universe is in a superposition of many states?

Yes, it does. The Copenhagen interpretation says that when you observe the universe, your observation becomes right, and your model of the world should make a 100% certain retrodiction about what just happened. This is mathematically equivalent to letting an MWI modeler know which world (or set of worlds with the same eigenstate of the observable) they’re in at some time.

However, in Copenhagen, the universe you observe is all there “is.” If I observe the electron with spin up, there is no other me that observes it with spin down. The probabilities in Copenhagen are more Bayesian than frequentist. Meanwhile in MWI the probabilities are frequencies of “actual” people measuring the electron. But since there is no such thing as an outside observer of the universe (that’s the point), the difference here doesn’t necessarily mean this isn’t an argument about definitions. :P

• Your Copenhagen Interpretation looks like starting with Many Worlds, and then rejecting the implied invisible worlds as an additional assumption about reality.

• My Copenhagen interpretation (the one I use to demonstrate ideas about the Copenhagen interpretation, not necessarily the interpretation I use when thinking about problems) looks like the Copenhagen Interpretation. And yes, it is close to what you said. But it’s not quite that simple, since all the math is preserved because of stuff like entanglement.

• Oh, I’m all in favor of MWI. I just don’t think we should claim that it makes different predictions from Copenhagen based simply on our scorn for Copenhagen.

• Very good point about large-scale interference. If it’s true, it makes me update in favor of MWI.

• see my response to Perplexed (tl;dr—update in favor of MWI)

• while Copenhagen seems to imply that it cannot occur on any scale larger than a human observer

Copenhagen doesn’t imply that. The collapse happens as a result of interaction between the observer and the observed system, which can be an atom or an entire galaxy.

• Copenhagen doesn’t imply that. The collapse happens as a result of interaction between the observer and the observed system, which can be an atom or an entire galaxy.

It has been my experience that there is no consensus amongst professed supporters of the Copenhagen Interpretation about what causes collapse, whether an observer is involved, and what an observer is. Given that, I might handle this “interpretation” by letting it split the probability between the possibilities that result from different concepts of collapse, and then note that it assigns less probability than Many Worlds to the actual outcome.

But, to avoid being unfair to your particular understanding of collapse, what does prase::Copenhagen say an observer is?

• what does prase::Copenhagen say an observer is?

prase::Copenhagen::observer is probably a fundamental entity, not definable without use of “observe”, “observable” or perhaps “consciousness”. (In fact, prase::Copenhagen doesn’t necessarily imply that the collapse is real rather than an effective practical way to describe reality; the above holds for those variants of Copenhagen which insist on the existence of an objective collapse.)

• If interaction with a human, under the conditions normally present in a laboratory, is sufficient to prevent interference, then I see no sensible interpretation where two worlds full of human observers are not prevented from interfering. Perhaps I should have been more clear—by larger, I meant a scale which includes human observers, such as the entire earth, not just one which is much larger than a human.

• I don’t understand your first sentence, and I can’t even say specifically why. What do you mean by an interpretation where human observers are prevented from interfering?

On the other hand, I would agree that if the collapse is objective, then we should be able to detect collapse induced by observers other than ourselves and experimentally tell apart observers from non-observers. But the standard use of “Copenhagen” doesn’t imply objective collapse; see e.g. Wikipedia.

• Suppose a scientist measures some qubit’s state to be 0. My understanding is that, whatever version of Copenhagen you adhere to, you no longer believe that there is another version of the scientist somewhere who has measured 1. Maybe this is wrong because of the distinction between objective and subjective collapse, but then I have absolutely no idea what distinguishes Copenhagen from many worlds. In particular, I assume that Copenhagen implies that there aren’t many worlds.

Now, assume that after observing this 0, the scientist told the entire world and influenced the course of world events, having a significant effect on billions of human observers. According to my understanding, Copenhagen says there is only a single version of earth which actually exists—the one in which the scientist observed 0 and told everyone about it.

According to many worlds, there are multiple versions of earth—one in which the scientist observed 0, and one in which the scientist observed 1. Many worlds says that it is possible for these different versions of earth to interfere with each other, in exactly the same way that the worlds where the electron went through the left slit and where the electron went through the right slit can interfere. However, because the earth is chock full of physical observers to measure which state the earth is in, Copenhagen seems to say that there is only one version of the earth and so there certainly can’t be any interference.

• My understanding is that, whatever version of Copenhagen you adhere to, you no longer believe that there is another version of the scientist somewhere who has measured 1.

Depends. From the outside observer’s point of view, he can be in a superposition of [has measured 0] and [has measured 1]. From the scientist’s point of view, the collapse has happened.

...but then I have absolutely no idea what distinguishes Copenhagen from many worlds.

That’s why it’s called an “interpretation”. It’s the way we speak about it, and some untestable statements about consciousness, perhaps with some philosophical implications, which make the whole difference. Of course, an objective collapse is a different thing, but I don’t believe many Copenhagenists today believe in an objective collapse.

According to many worlds, there are multiple versions of earth—one in which the scientist observed 0, and one in which the scientist observed 1.

Such a statement can be misleading. There is one version of Earth, but the individual observers see only certain projections. The difference between MWI and single-world interpretations is that MWI says that all projections are experienced.

• Do we really want a definition of “complexity of physical theories” that tells apart theories making the same predictions?

Yes. As you said, simpler theories have certain advantages over complex theories, such as the possibility of a deeper understanding of what’s going on. Of course, in that case we shouldn’t exactly optimize the K-complexity of their presentation; we should optimize an informal notion of simplicity or ease of understanding. But complexity of specification is probably useful evidence for those other metrics that are actually useful.

The error related to your preceding post would be to talk about varying probability of differently presented equivalent theories, but I don’t remember that happening.

• Yeah, I guess the preceding post needs some obvious amendments in light of this post (though the general point still stands). I hope people are smart enough to see them anyway.

I just don’t understand what sense it makes for a perfect Bayesian to distinguish between equivalent theories. Is it still honestly about “degrees of belief”, or is it now about those other informal properties that you list?

• I just don’t understand what sense it makes for a perfect Bayesian to distinguish between equivalent theories.

No sense. It’s a correct thing to do if depth of understanding of these theories is valuable and one is not logically omnipotent, but using complexity-leading-to-improbability to justify this principle would be cargo cult Bayesianism.

• The prior probability of a simple explanation is inherently greater than the prior probability of a complex explanation.

If all evidence/observations confirm both explanations equally, then the simple explanation is still in the lead, because it started out with a higher prior probability.

• Do we really want a definition of “complexity of physical theories” that tells apart theories making the same predictions?

If you look at the definition of the Solomonoff prior, you’ll notice that it’s actually a weighted sum over many programs that produce the desired output. This means that a potentially large number of programs, corresponding in this case to different formulations of physics, combine to produce the final probability of the data set.

So what’s really happening is that all formulations that produce identical predictions are effectively collapsed into an equivalence class, which has a higher probability than any individual formulation.
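This pooling can be sketched with a miniature made-up program space: each program’s weight is 2^-length, and programs are binned by the predictions they output (the code strings and outputs below are purely illustrative).

```python
from collections import defaultdict

# Each (code, output) pair stands in for a program and the predictions it
# generates; the codes and outputs are invented for illustration.
programs = [
    ("UU",    "A"),   # short formulation, predicts A
    ("UUCC",  "A"),   # longer equivalent formulation, also predicts A
    ("UUUUU", "B"),   # rival theory, predicts B
]

prior = defaultdict(float)
for code, output in programs:
    prior[output] += 2.0 ** -len(code)   # Solomonoff-style weight 2^-length

total = sum(prior.values())
p = {out: w / total for out, w in prior.items()}
# Equivalent formulations pool their weight: P(A) combines both A-programs,
# so the equivalence class outweighs any single formulation.
```

In a real Solomonoff prior the sum ranges over all programs on a universal machine; the toy version only shows how observationally equivalent formulations add rather than compete.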

• Yeah, I know. The post was about deconstructing Eliezer’s argument in favor of MWI, not about breaking the Solomonoff prior.

• Given that I already explained that it makes sense to say the MW formulation contributes more probability to the equivalence class than Collapse formulations do, it seems that your deconstruction is deconstructed.

• I’ll be quite happy if, as a result of my post, popular opinion on LW shifts from thinking that “MWI deserves a higher prior degree of belief because it’s simpler, which makes a clear-cut case for Bayes over Science” to thinking that “MWI contributes more probability to the equivalence class than Collapse formulations”. Some people already have it clear, like you. Others don’t.

• OK, as long as we remember that MWI and Collapse formulations are not really in the same equivalence class, and that the implied invisible can have implications for utility.

• A note: I am pretty sure that paul’s claim (that MWI predicts more interference than typical QM) is false, and comes from not considering entanglement (which is understandable, because entanglement is hard). For example, if you collapsed the wavefunction of an electron in the two-slit experiment improperly (by not keeping the state entangled after going through the slits), you would predict no interference.

• This “entanglement” is just Many Worlds applied to a subsystem rather than the whole universe. If you allow the entire universe to be involved in the entanglement, you are really talking about Many Worlds by another name. If you only allow subsystems to be entangled, you will make different predictions than Many Worlds.

• If you allow the entire universe to be involved in the entanglement, you are really talking about Many Worlds by another name.

The other name being “quantum mechanics.” :D

Yes, if the typical interpretation of QM said anything about not allowing n-particle entangled states, it would be inconsistent with the math of quantum mechanics. But it doesn’t, so it isn’t. (Note that some people have made their own interpretations that violate this, e.g. consciousness causes collapse. They were wrong.)

• Your “fix” seems problematic too, if it doesn’t allow belief in the implied invisible.

• Took me more than a day to parse your objection, and it seems to be valid and interesting. Thanks.

• I’m not sure yet.

What does cousin_it mean by

find the shortest algorithm that outputs the same predictions.

Does this algorithm necessarily model the predictions (in any fashion), or just list them? If the predictions are being modeled—then they’ll either predict or not predict the implied invisible.

If the predictions are not being modeled—I just don’t see how you can get an algorithm to output the right list without an internal model.

This comment on this page is relevant… For example, I think I agree with this:

In this case, the look-up table is essentially the program-that-lists-the-results, and the algorithm is the shortest description of how to get them. The equivalence is because, in some kind of sense, process and results imply each other. In my mind, this is a bit like some kind of space-like-information and time-like-information equivalence, or like that between a hologram and the surface it’s projected from.

• “Therein lies the rub. Do we really want a definition of ‘complexity of physical theories’ that tells apart theories making the same predictions?”

Yes.

“Evolution by natural selection occurs” and “God made the world and everything in it, but did so in such a way as to make it look exactly as if evolution by natural selection occurred” make the same predictions in all situations.

You can do perfectly good science with either hypothesis, but the latter postulates an extra entity—it’s a less useful way of thinking about things precisely because it’s more complex. It adds an extra cognitive load.

Selecting theories by their Kolmogorov complexity is just another way of saying we’re using Occam’s Razor. If you have two theories with the same explanatory power, making the same predictions, then you want to use the simpler one—not because it’s more likely to be ‘true’, but because it allows you to think more clearly.

• then you want to use the simpler one—not because it’s more likely to be ‘true’, but because it allows you to think more clearly.

Congratulations, you have now officially broken with Bayesianism and become a heretic. Your degree of belief in (prior probability of) a hypothesis should not depend on how clearly it allows you to think. Surely you can imagine all manner of ugly scenarios if that were the case.

• then you want to use the simpler one—not because it’s more likely to be ‘true’, but because it allows you to think more clearly.

Congratulations, you have now officially broken with Bayesianism and become a heretic. Your degree of belief in (prior probability of) a hypothesis should not depend on how clearly it allows you to think. Surely you can imagine all manner of ugly scenarios if that were the case.

Preferring to use a simpler theory doesn’t require believing it to be more probable than it is. Expected utility maximization to the rescue.

• Never mind usefulness, it seems to me that “Evolution by natural selection occurs” and “God made the world and everything in it, but did so in such a way as to make it look exactly as if evolution by natural selection occurred” are not the same hypothesis, that one of them is true and one of them is false, that it is simplicity that leads us to say which is which, and that we do, indeed, prefer the simpler of two theories that make the same predictions, rather than calling them the same theory.

• While my post was pretty misguided (I even wrote an apology for it), your comment looks even more misguided to me. In effect, you’re saying that between Lagrangian and Hamiltonian mechanics, at most one can be “true”. And you’re also saying that which of them is “true” depends on the programming language we use to encode them. Are you sure you want to go there?

• In effect, you’re saying that between Lagrangian and Hamiltonian mechanics, at most one can be “true”.

We may even be able to observe which one. Actually, I am pretty sure that if I looked closely at QM and these two formulations, I would go with Hamiltonian mechanics.

• Ah, but which Hamiltonian mechanics is the true one: the one that says real numbers are infinite binary expansions, or the one that says real numbers are Dedekind cuts? I dunno, your way of thinking makes me queasy.

• Sorry—I wrote an incorrect reply and deleted it. Let me think some more.

• That point of view has far-reaching implications that make me uncomfortable. Consider two physical theories that are equivalent in every respect, except they use different definitions of real numbers. So they have a common part C, and theory A is the conjunction of C with “real numbers are Dedekind cuts”, while theory B is the conjunction of C with “real numbers are infinite binary expansions”. According to your and Eliezer’s point of view as I understand it right now, at most one of the two theories can be “true”. So if C (the common part) is “true”, then ordinary logic tells us that at most one definition of the real numbers can be “true”. Are you really, really sure you want to go there?

• I think there’s a distinction that should be made explicit between “a theory” and “our human mental model of a theory.” The theory is the same, but we rightfully try to interpret it in the simplest possible way, to make it clearer to think about.

Usually, two different mental models necessarily imply two different theories, so it’s easy to conflate the two, but sometimes (in mathematics especially) that’s just not true.

• Hmmm. But the very first posting in the sequences says something about “making your beliefs pay rent in expected experience”. If you don’t expect different experiences in choosing between the theories, it seems that you are making an unfalsifiable claim.

I’m not totally convinced that the two theories do not make different predictions in some sense. The evolution theory pretty much predicts that we are not going to see a Rapture any time soon, whereas the God theory leaves the question open. Not exactly “different predictions”, but something close.

• Both theories are trying to pay rent on the same house; that’s the problem here, which is quite distinct from neither theory paying rent at all.

• Clever. But …

If theories A and B pay rent on the same house, then the theory (A OR B) pays enough rent so that the stronger theory A need pay no additional rent at all. Yet you seem to prefer A to B, and also to (A OR B).

• (A OR B) is more probable than A, but if A is much more probable than B, then saying “(A OR B)” instead of “A” is leaving out information.

• Let’s say A = (MWI is correct) and B = (Copenhagen).

The equivalent of “A OR B” is the statement “either Copenhagen or MWI is correct”, and I’m sure everyone here assigns “A OR B” a higher prior than either A or B separately.

But that’s not really a theory, it’s a disjunction between two different theories, so of course we want to understand which of the two is actually the correct one. Not sure what your objection is here.

EDITED to correct a wrong term.

• Not sure what your objection is here.

I’m not sure I have one. It is just a little puzzling how we might reconcile two things:

• EY’s very attractive intuition that of two theories making the same predictions, one is true and the other … what? False? Wrong? Well, … “not quite so true”.

• The tradition in Bayesianism and standard rationality (and logical positivism, for that matter) that the truth of a statement is to be found through its observable consequences.

ETA: Bayes’s rule only deals with the fraction of reality-space spanned by a sentence, never with the number of characters needed to express the sentence.

• There’s a useful heuristic to solve tricky questions about “truths” and “beliefs”: reduce them to questions about decisions and utilities. For example, the Sleeping Beauty problem is very puzzling if you insist on thinking in terms of subjective probabilities, but becomes trivial once you introduce any payoff structure. Maybe we could apply this heuristic here? Believing in one formulation of a theory over a different equivalent formulation isn’t likely to win a Bayesian reasoner many dollars, no matter what observations come in.
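To make the payoff-structure point concrete, here is a toy calculation (my own sketch, not anything from the thread): under the usual setup (fair coin; tails means two awakenings, heads means one; Beauty is paid per correct guess at each awakening), the “probability” dispute dissolves into an expected-value computation.

```python
# Toy illustration: Sleeping Beauty with an explicit payoff structure.
# Assumes a fair coin, one awakening on heads, two on tails, and a
# fixed payment per correct guess at each awakening.

from fractions import Fraction

def per_flip_value(guess):
    """Expected number of correct guesses per coin flip, for a fixed guess."""
    half = Fraction(1, 2)
    if guess == "heads":
        # Heads: 1 awakening, 1 correct guess. Tails: 2 awakenings, 0 correct.
        return half * 1 + half * 0
    else:
        # Heads: 1 awakening, 0 correct. Tails: 2 awakenings, 2 correct guesses.
        return half * 0 + half * 2

# Tails-awakenings are sampled twice per tails flip, so the fraction of
# awakenings that follow heads is (1/2) / (1/2 + 1) = 1/3.
expected_awakenings = Fraction(1, 2) * 1 + Fraction(1, 2) * 2
heads_fraction = (Fraction(1, 2) * 1) / expected_awakenings
```

Once the bet is specified, the right policy falls out mechanically; no stance on “subjective probability” is needed.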

• Believing in one formulation of a theory over a different equivalent formulation isn’t likely to win a Bayesian reasoner many dollars, no matter what observations come in.

Actually, it might help a reasoner saddled with bounded rationality. One theory might require less computation to get from theory to prediction, or it might require less memory to store. Having a fast, easy-to-use theory can be like money in the bank to someone who needs lots and lots of predictions.

It might be interesting to look at the idea someone here was talking about that merged Zadeh’s fuzzy logic with Bayesianism. Instead of simple Bayesian probabilities which can be updated instantaneously, we may need to think of fuzzy probabilities which grow sharper as we devote cognitive resources to refining them. But with a good, simple theory we can get a sharper picture quicker.

• I don’t understand your point about bounded rationality. If you know theory X is equivalent to theory Y, you can believe in X more, but use Y for calculations.

• That’s the definition of a free-floating belief, isn’t it? If you only have so many computational resources, even storing theory X in your memory is a waste of space.

• I think cousin_it’s point was that if you have a preference for both quickly solving problems and knowing the true nature of things, then if theory X tells you the true nature of things but theory Y is a hackjob approximation that nevertheless gives you the answer you need much faster (in computer terms, say, a simulation of the actual event vs. a Monte Carlo run with the probabilities just plugged in), then it might be positive utility even under bounded rationality to keep both theory X and theory Y.

edit: the assumption is that we have at least mild preferences for both, and that the bounds on our rationality are sufficiently high that this is the preferred option for most of science.

• It’s one thing if you want to calculate with a theory that is simpler because you don’t have a need for perfect accuracy. Newton is good enough for a large fraction of physics calculations, and so even though it is strictly wrong I imagine most reasoners would have need to keep it handy because it is simpler. But if you have two empirically equivalent and complete theories X and Y, and X is computationally simpler so you rely on X for calculating predictions, it seems to me you believe X. What would saying “No, actually I believe in Y, not X” even mean in this context? The statement is unconnected to anticipated experience and any conceivable payoff structure.

Better yet, taboo “belief”. Say you are an agent with a program that allows you to calculate, based on your observations, what your observations will be in the future contingent on various actions. You have another program that ranks those futures according to a utility function. What would it mean to add “belief” to this picture?

• Your first paragraph looks misguided to me: does it imply we should “believe” matrix multiplication is defined by the naive algorithm for small n, and the Strassen and Coppersmith-Winograd algorithms for larger values of n? Your second paragraph, on the other hand, makes exactly the point I was trying to make in the original post: we can assign degrees of belief to equivalence classes of theories that give the same observable predictions.
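The matrix-multiplication example can be made concrete (a sketch of mine, not from the thread): two textually different programs that compute the same input→output mapping sit in the same equivalence class for prediction purposes, so a reasoner who only scores predictions has no grounds for “believing” one over the other.

```python
# Two different programs for the same mapping: matrix multiplication.
# As programs they differ in structure (and in practice, in speed);
# as prediction-generators they are indistinguishable.

def matmul_naive(A, B):
    """Triple-loop definition, closest to the textbook formula."""
    n, m, p = len(A), len(B), len(B[0])
    C = [[0] * p for _ in range(n)]
    for i in range(n):
        for k in range(m):
            for j in range(p):
                C[i][j] += A[i][k] * B[k][j]
    return C

def matmul_transposed(A, B):
    """Same mapping, different program: dot products against B's columns."""
    Bt = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
assert matmul_naive(A, B) == matmul_transposed(A, B)  # identical predictions
```

The two functions disagree as objects of study (length, running time) while agreeing on every output, which is exactly the distinction the thread is circling.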

• For example, the Sleeping Beauty problem is very puzzling if you insist on thinking in terms of subjective probabilities, but becomes completely clear once you introduce a payoff structure.

Heh, I was just working on a post on that point.

Believing in one formulation of a theory over a different equivalent formulation isn’t likely to win a Bayesian reasoner many dollars, no matter what observations come in. Therefore the reasoner should assign degrees of belief to equivalence classes of theories rather than individual theories.

I agree that that is true about equivalent formulations, literally isomorphic theories (as in this comment), but is that really the case about MWI vs. Copenhagen? Collapse is claimed as something that’s actually happening out there in reality, not just as another way of looking at the same thing. Doesn’t it have to be evaluated as a hypothesis on its own, such that the conjunction (MWI & Collapse) is necessarily less probable than just MWI?

• Except the whole quantum suicide thing does create payoff structures. In determining whether or not to play a game of Quantum Russian Roulette, you take your estimated winnings for playing if MWI and Quantum Immortality are true, and your estimated winnings if MWI or Quantum Immortality is false, and weigh them according to the probability you assign each theory.

(ETA: But this seems to be a quirky feature of QM interpretation, not a feature of empirically equivalent theories generally.)

(ETA 2: And it is a quirky feature of QM interpretation because MWI + Quantum Immortality is empirically equivalent to single-world theories in a really quirky way.)

• IMO quantum suicide/immortality is so mysterious that it can’t support any definite conclusions about the topic we’re discussing. I’m beginning to view it as a sort of thread-killer, like “consciousness”. See a comment that mentions QI, collapse the whole thread, because you know it’s not gonna make you happier.

• I agree that neither we nor anyone else does a good job discussing it. It seems like a pretty important issue though.

• EY’s very attractive intuition that of two theories making the same predictions, one is true and the other … what? False? Wrong? Well, … “not quite so true”.

“More Wrong”. :)

I can think of two circumstances under which two theories would make the same predictions (that is, where they’d systematically make the same predictions, under all possible circumstances under which they could be called upon to do so):

• They are mathematically isomorphic — in this case I would say they are the same theory.

• They contain isomorphic substructures that are responsible for the identical predictions. In this case, the part outside what’s needed to actually generate the predictions counts as extra detail, and by the conjunction rule, this reduces the probability of the “outer” hypothesis.

The latter is where collapse vs. MWI falls, and where “we don’t know why the fundamental laws of physics are what they are” vs. “God designed the fundamental laws of physics, and we don’t know why there’s a God” falls, etc.

• The tradition in Bayesianism and standard rationality (and logical positivism, for that matter) that the truth of a statement is to be found through its observable consequences.

Since when is that the Bayesian tradition? Citation needed.

• the truth of a statement is to be found through its observable consequences.

Since when?

Well, I guess I am taking “observable consequences” to be something closely related to P(E|H)/P(E). And I am taking “the truth of a statement” to have something to do with P(H|E), adjusted for any bias that might have been present in the prior P(H).

I’m afraid this explanation is all the citation I can offer. I would be happy to hear your opinion along the lines of “That ain’t ‘truth’. ‘Truth’ is ____ to a Bayesian.”

• Observable consequences are part of what controls the plausibility of a statement, but not its truth. An unobservable truth can still be a truth. Things outside our past light cone exist despite being unobservable. Asking of a claim about some unobservable “Then how can we know whether it’s true?” is irrelevant to evaluating whether it is the sort of thing that could be a truth, because we’re not talking about ourselves. Confusing truths with beliefs — even carefully-acquired accurate beliefs — is mind projection.

I’m afraid this explanation is all the citation I can offer. I would be happy to hear your opinion along the lines of “That ain’t ‘truth’. ‘Truth’ is ____ to a Bayesian.”

I can’t speak for everyone who’d call themselves Bayesians, but I would say: There is a thing called reality, which causes our experiences and a lot of other things, characterized by its ability to not always do what we want or expect. A statement is true to the extent that it mirrors some aspect of reality (or some other structure, if specified).

• Observable consequences are part of what controls the plausibility of a statement, but not its truth. An unobservable truth can still be a truth.

There is a thing called reality, which causes our experiences and a lot of other things, characterized by its ability to not always do what we want or expect.

If we’re going to distinguish ‘truth’ from our ‘observations’, then we need to be able to define ‘reality’ as something other than ‘experience generator’ (or else decouple truth and reality).

• Personally, I suspect that we really need to think of reality as something other than an experience generator. What we can extract out of reality is only half of the story. The other half is the stuff we put in so as to create reality.

This is not a fully worked out philosophical position, but I do have some slogans:

• You can’t do QM with only kets and no bras.

• You can’t do Gentzen natural deduction with rules of elimination, but no rules of introduction.

• You can’t write a program with GOTOs, but no COMEFROMs.

(That last slogan probably needs some work. Maybe I’ll try something involving causes and effects.)

• How do you adjudicate a wager without observable consequences?

• Well, the second of those things already has very serious problems. See for example Quine’s Confirmation Holism. We’ve known for a long time that our theories are under-determined by our observations and that we need some other way of adjudicating empirically equivalent theories. This was our basis for preferring Special Relativity over Lorentz Ether Theory. Parsimony seems like one important criterion, but it involves two questions:

1. One man’s simple is another man’s complex. How do you rigorously identify the more parsimonious of two hypotheses? Lots of people think God is a very simple hypothesis. The most seemingly productive approach that I know of is the algorithmic complexity one that is popular here.

2. Is parsimony important because parsimonious theories are more likely to be ‘real’, or is the issue really one of developing clear and helpful prediction-generating devices?

The way the algorithmic probability stuff has been leveraged is by building candidates for universal priors. But this doesn’t seem like the right way to do it. Beliefs are about anticipating future experience, so they should take the form of “Sensory experience x will occur at time t” (or something reducible to this). Theories aren’t like this. Theories are frameworks that let us take some sensory experience and generate beliefs about our future sensory experiences.

So I’m not sure it makes sense to have beliefs distinguishing empirically identical theories. That seems like a kind of category error: a map-territory confusion. The question is, what do we do with this algorithmic complexity stuff that was so promising? I think we still have good reasons to be thinking cleanly about complicated science; the QM interpretation debate isn’t totally irrelevant. But it isn’t obvious algorithmic simplicity is what we want out of our theories (nor is it clear that what we want is the same thing other agents might want out of their theories). (ETA: Though of course K-complexity might still be helpful in making predictions between two possible futures that are empirically distinct. For example, we can assign a low probability to finding evidence of a moon landing conspiracy, since the theory that would predict discovering such evidence is unparsimonious. But if that is the case, if theories can be ruled improbable on the basis of the structure of the theory alone, why can we only do this with empirically distinct theories? Shouldn’t all theories be understandable in this way?)

• “Bayes’s rule only deals with the fraction of reality-space spanned by a sentence”

Well, that’s the thing: reality-space doesn’t concern just our observations of the universe. If two different theories make the same predictions about our observations but disagree about which mechanism produces those events we observe, those are two different slices of reality-space.

• Thanks, your comment is a very clear formulation of the reason why I wrote the post. Probably even better than the post itself.

I’m halfway tempted to write yet another post about complexity (maybe in the discussion area), summarizing all the different positions expressed here in the comments and bringing out the key questions. The last 24 hours have been a very educational experience for me. Or maybe let someone else do it, because I don’t want to spam LW.

• But that’s not really a theory, it’s a conjuction between two different theories,

It’s actually the disjunction.

• Yes, apologies. Fixed above.

• Making the same predictions means making the same assignments of probabilities to outcomes.

• Which brings us back to an issue which I was debating here a couple of weeks ago: Is there a difference between an event being impossible, and an event being of measure zero?

Orthodox Bayesianism says there is no difference, and strongly advises against thinking either to be the case. I’m wondering whether there isn’t some way to make the idea work that there is a distinction to be made—that some things are completely impossible given a theory, while other things are merely of infinitesimal probability.

• There’s a proposal to use surreal numbers for utilities. Such an approach was used for Go by Conway.

• It might be more accurate to say that surreal numbers are a subset of the numbers that were invented by Conway to describe the value of game positions.

• Interesting suggestion. I ought to look into that. Thx.

• When there is a testable physical difference between hypotheses, we want the one that makes the correct prediction.

When there is no testable physical difference between hypotheses, we want to use the one that makes it easiest to make the correct prediction. By definition, we can never get a prediction that wouldn’t have happened were we using the other hypothesis, but we’ll get that prediction quicker. Neither hypothesis can be said to be ‘the way the world really is’ because there’s no way to distinguish between them, but the simpler hypothesis is more useful.

• Wha? Then you must order the equivalent theories by running time, not code length. The two are frequently opposed: for example, the fastest known algorithm for matrix multiplication (in the big-O sense) is very complex compared to the naive one. In short, I feel you’re only digging yourself deeper into the hole of heresy.

• I think there’s a difference between looking at a theory as data versus looking at it as code.

You look at a theory as code when you need to use the theory to predict the future of something it describes. (E.g., will it rain.) For this purpose, theories that generate the same predictions are equivalent; you don’t care about their size. In fact, even theories with different predictions can be considered equivalent, as long as their predictions are close enough for your purpose. (See Newtonian vs. relativistic physics applied to predicting kitchen-sink performance.) You do care about how fast you can run them, though.

However, you look at a theory as data when you need to reason about theories, and “make predictions” about them, particularly unknown theories related to known ones. As long as two theories make exactly the same predictions, you don’t have much reason to reason about them. However, if they predict differently for something you haven’t tested yet, but will test in the future, and you need to take an action now that has different outcomes depending on the result of the future test (simple example: a bet), then you need to try to guess which is more likely.

You need something like a meta-theory that predicts which of the two is more likely to be true. Occam’s razor is one of those meta-theories.

Thinking about it more, this isn’t quite a disagreement with the post immediately above; it’s not immediately obvious to me that a simpler theory is easier to reason about (though intuition says it should be). But I don’t think Occam’s razor is about how easy it is to reason about theories; it just claims simpler ones are more likely. (Although one could justify it like this: take an incomplete theory; add one detail; add another detail; on each step you have to pick between many details you might add, so the more details you add the more likely you are to pick the wrong one (remember you haven’t tested the successive theories yet); thus, the more complex your theory the likelier you are to be wrong.)

• Well, firstly, who said I cared at all about ‘heresy’? I’m not replying here in order to demonstrate my adherence to the First Church Of Bayes or something...

And while there are, obviously, occasions where ordering by running time and code length are opposed, in general, when comparing two arbitrary programs which generate the same output, the longer one will also take longer. This is obvious when you consider it—if you have an arbitrary program X from the space of all programs that generate an output Y, there can only be a finite number of programs that generate that output more quickly. However, there are an infinite number of programs in the sample space that will generate the output more slowly, and that are also longer than X—just keep adding an extra ‘sleep 1’ before it prints the output, to take a trivial example.

In general, the longer the program, the more operations it performs and the longer it takes, when you’re sampling from the space of all possible programs. So while run time and code length aren’t perfectly correlated, they’re a very decent proxy for each other.

• if you have an arbitrary program X from the space of all programs that generate an output Y, there can only be a finite number of programs that generate that output more quickly.

Amusingly, this statement is false. If a program Z is faster than X, then there exist infinitely many versions of Z that also run faster than X: just add some never-executed code under an if(false) branch. I’m not sure whether your overall argument can be salvaged.

• You’re quite correct, there. I was only including code paths that can ever actually be executed, in the same way I wouldn’t count comments as part of the program. This seems to me to be the correct thing to do, and I believe one could come up with some more rigorous reasoning along the lines of my previous comment, but I’m too tired right now to do so. I’ll think about this...

• I was only including code paths that can ever actually be executed...

Wouldn’t a meta-algorithm that determines which paths are executable in a given algorithm necessarily not be able to do so for every possible algorithm, unless it was functionally equivalent to a halting oracle?

I’m not sure how problematic this is to your idea, but it’s one advantage that the simpler system of just counting total lines has.

• The length of the program description is not really the measure of how easy it is to make a correct prediction. In fact, the shortest program for predicting is almost never the one you should use to make predictions in practice, precisely because it is normally quite slow. It is also very rarely the program which is easiest to manipulate mentally, since short programs tend to be very hard for humans to reason about.

• Like PaulFChristiano said, the shortest accurate program isn’t particularly useful, but its predictive model is more a priori probable according to the universal / Occamian prior.

It’s really hard (and uncomputable) to discover, understand, and verify the shortest program that computes a certain input->prediction mapping. But we use the “shortest equivalent program” concept to judge which human-understandable program is more a priori probable.
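To make the “more a priori probable according to the Occamian prior” idea concrete, here is a toy sketch (mine, with made-up numbers): weight each candidate program by 2^(-length in bits) and normalize over the candidates under consideration.

```python
# Toy Occamian prior (illustrative only): among programs known to produce
# the same predictions, weight each by 2^(-length in bits) and normalize.
# The bit-lengths below are invented for illustration.

def occam_prior(lengths):
    """Map {name: bit-length} to a normalized prior favoring short programs."""
    weights = {name: 2.0 ** -n for name, n in lengths.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

prior = occam_prior({"short_program": 10, "long_program": 14})
# A program 4 bits longer is 2^4 = 16 times less a priori probable.
```

This is only the weighting scheme; actually finding the shortest program for a given mapping remains uncomputable, as the comment says.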

• When there is no testable physical difference between hypotheses, we want to use the one that makes it easiest to make the correct prediction.

Yes, we want to use the hypothesis that is easiest to use. But if we use it, does that commit us to ‘believing’ in it? In the case of no testable physical difference between hypotheses, I propose that someone has no obligation to believe (or admit they believe) that particular theory instead of another one with the same predictions.

I enthusiastically propose that we say we ‘have’ a belief only when we use or apply a belief for which there is an empirical difference in the predictions of the belief compared to the non-belief. Alternatively, we can use some other word instead of belief, one that will serve to carry this more relevant distinction.

(Later: I realize this comment is actually directed at cousin_it, since he was the one that wrote, ‘your degree of belief in (prior probability of) a hypothesis should not depend on how clearly it allows you to think’. I also think I may have reiterated what Vladimir_Nesov wrote here.)

• “Evolution by natural selection occurs” and “God made the world and everything in it, but did so in such a way as to make it look exactly as if evolution by natural selection occurred” make the same predictions in all situations.

I just wanted to make a comment here that the latter hypothesis is more complex because of the extra things that are packaged into the word “God”.

“Something” making the world and everything in it and making it look like evolution isn’t a hypothesis of higher complexity … it’s just the same hypothesis again, right? I feel like they’re the same hypothesis to a large extent because the predictions are the same, and also because “something”, “making” and “making it look like” are all vague enough to fill in with whatever is actually the case.

• Two ar­gu­ments—or maybe two for­mu­la­tions of the one ar­gu­ment—for com­plex­ity re­duc­ing prob­a­bil­ity, and I think the jux­ta­po­si­tion ex­plains why it doesn’t feel like com­plex­ity should be a straight-up penalty for a the­ory.

The hu­man-level ar­gu­ment for com­plex­ity re­duc­ing prob­a­bil­ity some­thing like A∩B is more prob­a­ble than A∩B∩C be­cause the sec­ond has three fault-lines, so to speak, and the first only has two, so the sec­ond is more likely to crack. edit: equally or more likely, not strictly more likely. (For en­g­ineers out there; I have found this metaphor to be in­valuable both in spot­ting this in con­ver­sa­tion, and ex­plain­ing this in con­ver­sa­tion to peo­ple). As byrnema noted down be­low, that doesn’t seem ap­pli­ca­ble here, at least not in the di­rect sim­pler = bet­ter way, es­pe­cially when hav­ing the same pre­dic­tions seems to in­di­cate that A, B, and C are all right.

The formal argument for a complexity penalty (and this is philosophy, so bear with me) is that a priori, having absolutely no experiences of the universe, all premises are equally likely (with nothing to privilege any of them, they default to the universal prior, if you like), so the theory with the fewest conjunctions of premises is the most likely by virtue of probability theory. Now, we are restricted in our observations, because they don’t tell us what actually is; they merely tell us that anything that predicts the outcome remains possible, and everything that doesn’t predict the outcome is ruled out. This includes ad-hoc theories and overcomplicated theories like “Odin made Horus made God made the universe as we know it.” However, we can extend the previous argument: given that our observations have narrowed the universe as we know it to this section of hypotheses, we have no experiences that say anything about any of the hypotheses within that section. So, a priori, all possible premises within that section are equally likely, and we should choose the one with the fewest conjunctions of premises, according to probability theory.
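A quick Monte Carlo sketch of the point that adding a conjunct can only remove probability mass (my own toy illustration; the 0.8 event probabilities are arbitrary assumptions, not anything from the thread):

```python
import random

random.seed(0)

# Toy illustration: three independent events A, B, C, each true with an
# arbitrary probability of 0.8.  Adding a conjunct can only remove
# worlds, so Pr(A and B) >= Pr(A and B and C).
N = 100_000
ab = abc = 0
for _ in range(N):
    a, b, c = (random.random() < 0.8 for _ in range(3))
    ab += a and b
    abc += a and b and c

print(ab / N)   # close to 0.8 * 0.8 = 0.64
print(abc / N)  # close to 0.8 ** 3 = 0.512
```

The inequality holds exactly, not just in the sample: the A∩B∩C worlds are a subset of the A∩B worlds.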

This doesn’t really get to the heart of the matter addressed in the post, but it does justify a form of complexity-as-penalty that has some bearing: namely, that if the Hamiltonian formulation requires fewer premises than the Lagrangian, and predictions bear both of these systems out equally well, the Hamiltonian is more probable, because it is less likely to be wrong due to a false premise somewhere in the area we haven’t yet accessed. (In formal logic terms, the Lagrangian formulation is probably using some premise it doesn’t need.)

• The human-level argument for complexity reducing probability is something like: A∩B is more probable than A∩B∩C because the second has three fault-lines, so to speak, and the first only has two, so the second is more likely to crack.

Strictly speaking, Pr(A∩B) ≥ Pr(A∩B∩C), not Pr(A∩B) > Pr(A∩B∩C). Otherwise, excellent post.

• Oh dear. Thanks for point­ing that out! Go­ing to fix it.

• Uu­uh­hhh, wait, there’s some­thing wrong with your post. A sim­ple log­i­cal state­ment can im­ply a com­plex-look­ing log­i­cal state­ment, right? Imag­ine that C is a very sim­ple state­ment that im­plies B which is very com­plex. Then A∩B∩C is log­i­cally equiv­a­lent to A∩C, which is sim­pler than A∩B be­cause C is sim­pler than B by as­sump­tion. Whoops.
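The equivalence claimed here can be checked mechanically. A toy propositional check (my own construction, not the commenter’s): if C logically implies B, then every admissible world satisfies C → B, and in all such worlds A∧B∧C agrees with A∧C:

```python
from itertools import product

# Assume C logically implies B: keep only truth assignments (worlds)
# where C -> B holds.  In every such world, A & B & C == A & C.
def implies(p, q):
    return (not p) or q

worlds = [w for w in product([False, True], repeat=3)
          if implies(w[2], w[1])]   # w = (a, b, c); keep worlds with c -> b

equivalent = all((a and b and c) == (a and c) for a, b, c in worlds)
print(len(worlds))  # 6 of the 8 assignments survive the assumption
print(equivalent)   # True
```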

• You can make a state­ment more com­plex by adding more con­junc­tions or by adding more dis­junc­tions. In gen­eral, the com­plex­ity of a state­ment about the world has no di­rect bear­ing on the prior prob­a­bil­ity we ought to as­sign to it. My pre­vi­ous post (linked from this one) talks about that.

• find the short­est al­gorithm that out­puts the same pre­dic­tions.

Pre­dic­tion mak­ing is not a fun­da­men­tal at­tribute that hy­pothe­ses have. What dis­t­in­guishes hy­pothe­ses is what they are say­ing is re­ally go­ing on. We use that to make pre­dic­tions.

The waters get muddy when dealing with fundamental theories of the universe. In a more general case: if we have two theories which lead to identical predictions of the behavior of an impenetrable black box, but say different things about the interior, then we should choose the simpler one. If at some point in the future we figure out how to open the black box, then the things you had labeled implementation details might turn out to lead to predictions.

I don’t think we should aban­don that just be­cause we hit a black box that ap­pears fun­da­men­tally im­pen­e­tra­ble.

• we should choose the sim­pler one

Why do you use the ad­jec­tive ‘sim­pler’? I un­der­stand that this isn’t just you, but the com­mon term for this con­text. But we re­ally mean ‘more prob­a­ble’, cor­rect? In which case, why don’t we just say, ‘more prob­a­ble’?

I’m not sure what ‘simpler’ means, but I don’t think the relationship between ‘simple’ and ‘probable’ is straightforward, except when the more complex thing is a subset of the simpler thing; that is, as in the usual example that A∩B is more probable than A∩B∩C.

• Sim­pler is not always more prob­a­ble, it’s just some­thing with which to build your pri­ors.

If you have two the­o­ries that make differ­ent but similar pre­dic­tions of noisy data, the one that fits the data bet­ter might be the more prob­a­ble, even if it’s vastly more com­plex.

• Sup­pose, coun­ter­fac­tu­ally, that Many Wor­lds QM and Col­lapse QM re­ally always made the same pre­dic­tions, and so you want to say they are both the same the­ory QM. It still makes sense to ask what is the com­plex­ity of Many Wor­lds QM and how much prob­a­bil­ity does it con­tribute to QM, and what is the com­plex­ity of Col­lapse QM and how much prob­a­bil­ity does it con­tribute to QM. It even makes sense to say that Many Wor­lds QM has a strictly smaller com­plex­ity, and con­tributes more prob­a­bil­ity, and is the bet­ter for­mu­la­tion.

• It still makes sense to ask what is the com­plex­ity of Many Wor­lds QM and how much prob­a­bil­ity does it con­tribute to QM, and what is the com­plex­ity of Col­lapse QM and how much prob­a­bil­ity does it con­tribute to QM.

You can of course introduce the universal prior over equivalent formulations of a given theory, and state which formulations weigh how much according to this prior, but I don’t see in what way this is a natural structure to consider, or what questions it allows us to understand better.

• It seems you want to define the com­plex­ity of QM by sum­ming over all al­gorithms that can gen­er­ate the pre­dic­tions of QM, rather than just tak­ing the short­est one. In that case you should prob­a­bly take the same ap­proach to defin­ing K-com­plex­ity of bit strings: sum over all al­gorithms that print the string, not take the short­est one. Do you sub­scribe to that point of view?

• It seems you want to define the com­plex­ity of QM by sum­ming over all al­gorithms that can gen­er­ate the pre­dic­tions of QM, rather than just tak­ing the short­est one.

Yes, though to be clear, it is the prior probability associated with the complexity of each individual algorithm that I would sum over to get the prior probability of that common set of predictions being correct. I don’t consider the common set of predictions to have a conceptually useful complexity in the same sense that the algorithms do.

In that case you should prob­a­bly take the same ap­proach to defin­ing K-com­plex­ity of bit strings: sum over all al­gorithms that print the string, not take the short­est one. Do you sub­scribe to that point of view?

I would ap­ply the same ap­proach to mak­ing pre­dic­tions about bit strings.
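The contrast under discussion can be sketched like this (the program set and bit-lengths are made up purely for illustration): a Kolmogorov-style score keeps only the shortest program producing a given prediction set, while the summing approach weighs every matching program by 2^-length, in the style of the universal prior:

```python
# Hypothetical programs, each mapped to (length in bits, output).
programs = {
    "p1": (12, "QM-predictions"),
    "p2": (15, "QM-predictions"),   # an equivalent, longer formulation
    "p3": (14, "other-predictions"),
}

def k_score(output):
    # Kolmogorov-style: only the shortest program for this output counts.
    return min(l for l, o in programs.values() if o == output)

def solomonoff_weight(output):
    # Summing approach: every program contributes 2^-length.
    return sum(2 ** -l for l, o in programs.values() if o == output)

print(k_score("QM-predictions"))            # 12
print(solomonoff_weight("QM-predictions"))  # 2**-12 + 2**-15
```

Under the summing view, the longer equivalent formulation still adds (a little) probability to the shared prediction set instead of being discarded.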

• I don’t con­sider the com­mon set of pre­dic­tions to have a con­cep­tially use­ful com­plex­ity in the same sense that the al­gorithms do.

Why? Both are bit strings, no?

• My com­puter rep­re­sents num­bers and let­ters as bit strings. This doesn’t mean it makes sense to mul­ti­ply let­ters to­gether.

• This is re­lated to a point that I at­tempted to make pre­vi­ously. You can mea­sure com­plex­ity, but you must pick the con­text ap­pro­pri­ately.

• “But imag­ine you re­fac­tor your pre­dic­tion-gen­er­at­ing pro­gram and make it shorter; does this mean the phys­i­cal the­ory has be­come sim­pler?”

Yeah (given the caveats already mentioned by Vladimir), since any physical theory is a prediction-generating program. A theory that isn’t a prediction-generating program isn’t a theory at all.

• I think that while a sleek de­cod­ing al­gorithm and a mas­sive look-up table might be math­e­mat­i­cally equiv­a­lent, they differ markedly in what sort of pro­cess ac­tu­ally car­ries them out, at least from the POV of an ob­server on the same ‘meta­phys­i­cal level’ as the pro­cess. In this case, the look-up table is es­sen­tially the pro­gram-that-lists-the-re­sults, and the al­gorithm is the short­est de­scrip­tion of how to get them. The equiv­alence is be­cause, in some kind of sense, pro­cess and re­sults im­ply each other. In my mind, this a bit like some kind of space-like-in­for­ma­tion and time-like-in­for­ma­tion equiv­alence, or as that be­tween a holo­gram and the sur­face it’s pro­jected from.

In the end, how are we to ever prefer one kind of description over the other? I can only think that it either comes down to some arbitrary aesthetic appreciation of elegance, or maybe some kind of match between the form of a description and how it fits in with our POV; our minds can be described in many ways, but only one corresponds directly with how we observe ourselves and reality, and we want any model to describe our minds with as little re-framing as possible.

Now, could some­one please tell me if what I have just said makes any kind of sense?!

• In the end, how are we to ever pre­fer one kind of de­scrip­tion over the other?

The min­i­mum size of an al­gorithm will de­pend on the con­text in which it is rep­re­sented. To mean­ingfully com­pare min­i­mum al­gorithm sizes we must choose a con­text that rep­re­sents the es­sen­tial en­tities and re­la­tion­ships of the do­main in con­sid­er­a­tion.

• The min­i­mum size of an al­gorithm will de­pend on the con­text in which it is represented

Isn’t one of the basic results of Kolmogorov complexity/information theory that algorithms/programs can be converted from one formalism/domain to another with a constant-size prefix, and hence there will be only a constant additive penalty in # of bits needed to distinguish the right algorithm in even the most biased formalism?

• Isn’t one of the ba­sic re­sults of Kol­mogorov com­plex­ity/​in­for­ma­tion the­ory that al­gorithms/​pro­grams can be con­verted from one for­mal­ism/​do­main to an­other with a con­stant-size pre­fix/​penalty...

I be­lieve that my point holds.

This con­stant-size pre­fix be­comes part of the con­text in which the al­gorithm is rep­re­sented. One way to think about it is that the pre­fix cre­ates an in­ter­pre­ta­tion layer which trans­lates the al­gorithm from its do­main of im­ple­men­ta­tion to the sub­strate do­main.

To restate my point in these new terms, the pre­fix must be cho­sen to provide the ap­pro­pri­ate model of the do­main un­der con­sid­er­a­tion, to the al­gorithms be­ing com­pared. It does not make sense to con­sider al­gorithms im­ple­mented un­der differ­ent do­main mod­els (differ­ent pre­fixes).

For ex­am­ple if I want to com­pare the com­plex­ity of 3sat ex­pres­sions, then I shouldn’t be con­sid­er­ing al­gorithms in do­mains that sup­port mul­ti­pli­ca­tion.

• Another way to think of the con­stant-size pre­fix is that one can choose any com­puter lan­guage in which to write the pro­gram which out­puts the string, and then en­code a com­piler for that lan­guage in the pre­fix.

This works fine for the­ory: af­ter all, K-com­plex­ity is not com­putable, so we re­ally are in the do­main of the­ory here. For prac­ti­cal situ­a­tions (even stretch­ing the term “prac­ti­cal” to in­clude QM in­ter­pre­ta­tions!), if the length of the pre­fix is non-neg­ligible com­pared to the length of the pro­gram, then we can get mis­lead­ing re­sults. (I would love a cor­rec­tion or some help in sup­port­ing my in­tu­ition here.)

As a re­sult, I think I agree that the choice of rep­re­sen­ta­tion mat­ters.

How­ever, I don’t agree that there is a prin­ci­pled way of choos­ing the right rep­re­sen­ta­tion. There is no such thing as the sub­strate do­main. Phrases such as “the es­sen­tial en­tities and re­la­tion­ships of the do­main” are too sub­jec­tive.

• ...if the length of the pre­fix is non-neg­ligible com­pared to the length of the pro­gram, then we can get mis­lead­ing re­sults.

For the pur­poses of com­plex­ity com­par­i­sons the pre­fix should be held con­stant across the al­gorithms. You should always be com­par­ing al­gorithms in the same lan­guage.

How­ever, I don’t agree that there is a prin­ci­pled way of choos­ing the right rep­re­sen­ta­tion.

You are cor­rect. I only use phrases such as “the es­sen­tial en­tities and re­la­tion­ships of the do­main” to out­line the na­ture of the prob­lem.

The prob­lem with com­par­ing the com­plex­ity of QM in­ter­pre­ta­tions is that our rep­re­sen­ta­tion of those in­ter­pre­ta­tions is ar­bi­trary. We can only guess at the proper rep­re­sen­ta­tion of QM. By choos­ing differ­ent rep­re­sen­ta­tions we could fa­vor one the­ory or the other as the most sim­ple.

• For the pur­poses of com­plex­ity com­par­i­sons the pre­fix should be held con­stant across the al­gorithms. You should always be com­par­ing al­gorithms in the same lan­guage.

• Oh, that seems sensible. It makes the problem of choosing the language even more acute though, since now we can ignore the description length of the compiler itself, meaning that even crazy languages (such as the language which outputs the Encyclopaedia Britannica with a single instruction) are in contention. The point of requiring the language to be encoded in the prefix, and its length added to the description length, is to prevent us from “cheating” in this way.

I had always as­sumed that it was nec­es­sary to al­low the pre­fix to vary. Clearly “ab­cab­cabc” and “aaabb­bccc” re­quire differ­ent pre­fixes to ex­press them as suc­cinctly as pos­si­ble. In prin­ci­ple there’s no clear dis­tinc­tion be­tween a pre­fix which en­codes an en­tire new lan­guage and a pre­fix which just sets up a func­tion to take ad­van­tage of the reg­u­lar­i­ties of the string.
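A toy version of the point about prefixes (both “languages” here are my own invention): each string has a short program in one language and a long one in the other, and the prefix is what selects the interpreter:

```python
# Two toy "languages": a repeat language that takes (unit, count), and a
# runs language that takes a list of (character, run-length) pairs.
def run_repeat(unit, count):
    return unit * count

def run_runs(runs):
    return "".join(ch * n for ch, n in runs)

# "abcabcabc" is a two-token program in the repeat language...
assert run_repeat("abc", 3) == "abcabcabc"
# ...but needs nine run pairs in the runs language.
assert run_runs([(ch, 1) for ch in "abcabcabc"]) == "abcabcabc"

# "aaabbbccc" is the opposite case: three run pairs, but no short repeat.
assert run_runs([("a", 3), ("b", 3), ("c", 3)]) == "aaabbbccc"
print("both strings reproduced")
```

The choice of which interpreter (prefix) to fix in advance is exactly what determines which string counts as “simpler”.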

• In prin­ci­ple there’s no clear dis­tinc­tion be­tween a pre­fix which en­codes an en­tire new lan­guage and a pre­fix which just sets up a func­tion to take ad­van­tage of the reg­u­lar­i­ties of the string.

Yes, and this is im­por­tant to see. The split be­tween con­tent and con­text can be made any­where, but the mean­ing of the con­tent changes de­pend­ing on where the split is made.

If you al­low the pre­fix to change then you are con­sid­er­ing string lengths in terms of the base lan­guage. This lan­guage can bias the re­sult in re­la­tion to the prob­lem do­main that you are ac­tu­ally in­ter­ested in.

As I said above:

For ex­am­ple if I want to com­pare the com­plex­ity of 3sat ex­pres­sions, then I shouldn’t be con­sid­er­ing al­gorithms in do­mains that sup­port mul­ti­pli­ca­tion.

• Quantum Immortality, at least, is something that isn’t the same under the MWI as under any other interpretation of QM.

There is no QI outside the MWI. Do you expect to find any quantum-immortal suiciders in your MWI branch? No? Why not?

• Quan­tum im­mor­tal­ity is not ob­serv­able. You sur­viv­ing a quan­tum suicide is not ev­i­dence for MWI—no more than it is for ex­ter­nal ob­servers.

• What about me surviving a thousand quantum suicides (with negligible odds of survival) in a row?

• That only pro­vides ev­i­dence that you are de­ter­minedly suici­dal and that you will even­tu­ally suc­ceed.

• But I’d have fun with my re­al­ity-steer­ing an­thropic su­per­pow­ers in the mean­time.

• One of you would. The huge num­ber of other di­verg­ing yous would run into an un­pleas­ant sur­prise sooner or later.

• No. You’re comparing the likelihoods of two hypotheses. The observation that you survived 1000 good suicide attempts is much more likely under MWI than under Copenhagen. Then you flip it around using Bayes’ rule, and believe in MWI.

But other Bayesi­ans around you should not agree with you. This is a case where Bayesi­ans should agree to dis­agree.
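The update being described can be written out explicitly. Note that the likelihood assignments here (survival certain given MWI-plus-QI, p per attempt otherwise) are exactly the assumptions the rest of this thread disputes; this is a sketch of the argument, not an endorsement:

```python
# Sketch of the disputed Bayesian update.  The likelihoods are the
# contested assumptions: survival is taken as certain given MWI-plus-QI,
# and p per independent attempt given a single-world view.
def posterior_mwi(prior, p_survive_single, attempts):
    like_mwi = 1.0                                  # assumed: QI guarantees survival
    like_single = p_survive_single ** attempts      # assumed: independent attempts
    numerator = like_mwi * prior
    return numerator / (numerator + like_single * (1 - prior))

print(posterior_mwi(0.5, 0.5, 10))    # 1024/1025, about 0.999
print(posterior_mwi(0.5, 0.5, 1000))  # overwhelmingly close to 1
```

Whether the `like_mwi = 1.0` line is legitimate is precisely the question of whether to condition on the observer existing.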

• First, not to be nit-picky, but MWI != QI. Second, if your suicide attempts are well documented, the branch in which you survived would be populated by Bayesians who agreed with you, no?

• The Bayesi­ans wouldn’t agree with you, be­cause the ob­ser­va­tion that you sur­vived all those suicide at­tempts is, to them, equally likely un­der MWI or Copen­hagen.

• What is QI?

• Flip a quan­tum coin.

The ob­ser­va­tion that you sur­vived 1000 good suicide at­tempts is much more likely un­der MWI than un­der Copen­hagen.

Isn’t that like say­ing “Un­der MWI, the ob­ser­va­tion that the coin came up heads, and the ob­ser­va­tion that it came up tails, both have prob­a­bil­ity of 1”?

The ob­ser­va­tion that I sur­vive 1000 good suicide at­tempts has a prob­a­bil­ity of 1, but only if I con­di­tion on my be­ing ca­pa­ble of mak­ing any ob­ser­va­tion at all (i.e. al­ive). In which case it’s the same un­der Copen­hagen.

• The ob­ser­va­tion is that you’re al­ive. If the Quan­tum Im­mor­tal­ity hy­poth­e­sis is true you will con­tinue mak­ing that ob­ser­va­tion af­ter an ar­bi­trary num­ber of good suicide at­tempts. The prob­a­bil­ity that you will con­tinue mak­ing that ob­ser­va­tion if Quan­tum Im­mor­tal­ity is false is much smaller than one.

• The prob­a­bil­ity that there ex­ists an Everett branch in which I con­tinue mak­ing that ob­ser­va­tion is 1. I’m not sure if jump­ing straight to sub­jec­tive ex­pe­rience from that is jus­tified:

If P(I sur­vive|MWI) = 1, and P(I sur­vive|Copen­hagen) = p, then what is the rest of that prob­a­bil­ity mass in Copen­hagen in­ter­pre­ta­tion? Why is P(~(I sur­vive)|Copen­hagen) = 1-p and what does it re­ally de­scribe? It seems to me that call­ing it “I don’t make any ob­ser­va­tion” is jump­ing from sub­jec­tive ex­pe­riences back to ob­jec­tive. This looks like a con­fu­sion of lev­els.

ETA: And, of course, the prob­lem with “an­thropic prob­a­bil­ities” gets even harder when you con­sider copies and merg­ing, simu­la­tions, Teg­mark level 4, and Boltz­mann brains (The An­thropic Trilemma). I’m not sure if there even is a gen­eral solu­tion. But I strongly sus­pect that “you can prove MWI by quan­tum suicide” is an in­cor­rect us­age of prob­a­bil­ities.

• It even depends on philosophy. Specifically, on whether the following equality holds:

I survive = There (not necessarily in our universe) exists someone who remembers everything I remember now, plus the failed suicide I’m about to conduct.

or

I survive = There exists someone who doesn’t remember everything I remember now, but who acts as I would have acted if I remembered what he remembers. (I’m not sure whether I correctly expressed the subjunctive mood.)

• If P(I sur­vive|MWI) = 1, and P(I sur­vive|Copen­hagen) = p, then what is the rest of that prob­a­bil­ity mass in Copen­hagen in­ter­pre­ta­tion?

First, I’m gonna clar­ify some terms to make this more pre­cise. Let Y be a per­son psy­cholog­i­cally con­tin­u­ous with your pre­sent self. P(there is some Y that ob­serves sur­viv­ing a suicide at­tempt|Quan­tum im­mor­tal­ity) = 1. Note MWI != QI. But QI en­tails MWI. P(there is some Y that ob­serves sur­viv­ing a suicide at­tempt| ~QI) = p.

It fol­lows from this that P(~(there is some Y that ob­serves sur­viv­ing a suicide at­tempt)|~QI) = 1-p.

I don’t see a con­fu­sion of lev­els (what­ever that means).

ETA: And, of course, the prob­lem with “an­thropic prob­a­bil­ities” gets even harder when you con­sider copies and merg­ing, simu­la­tions, Teg­mark level 4, and Boltz­mann brains (The An­thropic Trilemma). I’m not sure if there even is a gen­eral solu­tion. But I strongly sus­pect that “you can prove MWI by quan­tum suicide” is an in­cor­rect us­age of prob­a­bil­ities.

I don’t know if this is the point you meant to make but the ex­is­tence of these other hy­pothe­ses that could im­ply an­thropic im­mor­tal­ity definitely does get in the way of pro­vid­ing ev­i­dence in fa­vor of Many Wor­lds through suicide. Sur­viv­ing in­creases the prob­a­bil­ity of all of those hy­pothe­ses (to differ­ent ex­tents but not re­ally enough to dis­t­in­guish them).

• First, I’m gonna clar­ify some terms to make this more pre­cise. Let Y be a per­son psy­cholog­i­cally con­tin­u­ous with your pre­sent self. P(there is some Y that ob­serves sur­viv­ing a suicide at­tempt|Quan­tum im­mor­tal­ity) = 1. Note MWI != QI. But QI en­tails MWI. P(there is some Y that ob­serves sur­viv­ing a suicide at­tempt| ~QI) = p.

It fol­lows from this that P(~(there is some Y that ob­serves sur­viv­ing a suicide at­tempt)|~QI) = 1-p.

I don’t see a con­fu­sion of lev­els (what­ever that means).

I still see a prob­lem here. Sub­sti­tute quan­tum suicide → quan­tum coin­flip, and sur­viv­ing a suicide at­tempt → ob­serv­ing the coin turn­ing up heads.

Now we have P(there is some Y that ob­serves coin fal­ling heads|MWI) = 1, and P(there is some Y that ob­serves coin fal­ling heads|Copen­hagen) = p.

So any spe­cific out­come of a quan­tum event would be ev­i­dence in fa­vor of MWI.

• I think that works ac­tu­ally. If you ob­serve 30 quan­tum heads in a row you have strong ev­i­dence in fa­vor of MWI. The quan­tum suicide thing is just a way of in­creas­ing the pro­por­tion of fu­ture you’s that have this in­for­ma­tion.

• If you ob­serve 30 quan­tum heads in a row you have strong ev­i­dence in fa­vor of MWI.

But then if I ob­served any string of 30 out­comes I would have strong ev­i­dence for MWI (if the coin is fair, “p” for any spe­cific string would be 2^-30).

• You have to spec­ify a par­tic­u­lar string to look for be­fore you do the ex­per­i­ment.

• Sorry, now I have no idea what we’re talk­ing about. If your ex­per­i­ment in­volves kil­ling your­self af­ter see­ing the wrong string, this is close to the stan­dard quan­tum suicide.

If not, I would have to see the prob­a­bil­ities to un­der­stand. My anal­y­sis is like this: P(I ob­serve string S | MWI) = P(I ob­serve string S | Copen­hagen) = 2^-30, re­gard­less of whether the string S is speci­fied be­fore­hand or not. MWI doesn’t mean that my next Everett branch must be S be­cause I say so.
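The arithmetic behind this analysis (my numbers, just spelling out the comment): any specific string of 30 fair flips gets probability 2^-30 under either view, so the likelihood ratio is 1 and the observation favors neither interpretation:

```python
# Probability of one specific string of 30 fair quantum coin flips,
# under each interpretation, per the parent comment's analysis.
p_string_mwi = 0.5 ** 30
p_string_copenhagen = 0.5 ** 30

likelihood_ratio = p_string_mwi / p_string_copenhagen
print(p_string_mwi)       # about 9.3e-10
print(likelihood_ratio)   # 1.0 -- no evidence either way
```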

• The rea­son why this doesn’t work (for coins) is that (when MWI is true) A=”my ob­ser­va­tion is heads” im­plies B=”some Y ob­serves heads”, but not the other way around. So P(B|A)=1, but P(A|B) = p, and af­ter plug­ging that into the Bayes for­mula we have P(MWI|A) = P(Copen­hagen|A).

Can you trans­late that to the quan­tum suicide case?

• Isn’t that like say­ing “Un­der MWI, the ob­ser­va­tion that the coin came up heads, and the ob­ser­va­tion that it came up tails, both have prob­a­bil­ity of 1”?

I have no the­o­ries about what you’re think­ing when you say that.

• Either you condition the observation (of surviving 1000 attempts) on the observer existing, and you have 1 in both cases, or you don’t condition it on the observer and you have p^1000 in both cases. You can’t have it both ways.

• It con­vinces you that MWI is true. Due to the na­ture of quan­tum suicide, though, you will strug­gle to share this rev­e­la­tion with any­one else.

• That’s the prob­lem—it shouldn’t re­ally con­vince him. If he shares all the data and pri­ors with ex­ter­nal ob­servers, his pos­te­rior prob­a­bil­ity of MWI be­ing true should end up the same as theirs.

It’s not very different from surviving a thousand classical Russian roulette games in a row.

ETA: If the chance of sur­vival is p, then in both cases P(I sur­vive) = p, P(I sur­vive | I’m there to ob­serve it) = 1. I think you should use the sec­ond one in ap­prais­ing the MWI...

ETA2: Ok maybe not.

• If he shares all the data and pri­ors with ex­ter­nal ob­servers, his pos­te­rior prob­a­bil­ity of MWI be­ing true should end up the same as theirs.

No; I think you’re us­ing the Au­mann agree­ment the­o­rem, which can’t be used in real life. It has many ex­ceed­ingly un­re­al­is­tic as­sump­tions, in­clud­ing that all Bayesi­ans agree com­pletely on all defi­ni­tions and all cat­e­gory judge­ments, and all their knowl­edge about the world (their par­ti­tion func­tions) is mu­tual knowl­edge.

In par­tic­u­lar, to deal with the quan­tum suicide prob­lem, the rea­soner has to use an in­dex­i­cal rep­re­sen­ta­tion, mean­ing this is knowl­edge ex­pressed by a propo­si­tion con­tain­ing the term “me”, where me is defined as “the agent do­ing the rea­son­ing”. A propo­si­tion that con­tains an in­dex­i­cal can’t be mu­tual knowl­edge. You can trans­form it into a differ­ent form in some­one else’s brain that will have the same ex­ten­sional mean­ing, but that per­son will not be able to de­rive the same con­clu­sions from it, be­cause some of their knowl­edge is also in in­dex­i­cal form.

(There’s a more ba­sic prob­lem with the Au­mann agree­ment the­o­rem—when it says, “To say that 1 knows that 2 knows E means that E in­cludes all P2 in N2 that in­ter­sect P1,” that’s an in­cor­rect us­age of the word “knows”. 1 knows that E in­cludes P1(w), and that E in­cludes P2(w). 1 con­cludes that E in­cludes P1 union P2, for some P2 that in­ter­sects P1. Not for all P2 that in­ter­sect P1. In other words, the the­o­rem is math­e­mat­i­cally cor­rect, but se­man­ti­cally in­cor­rect; be­cause the things it’s talk­ing about aren’t the things that the English gloss says it’s talk­ing about.)

• There are in­deed many cases where Au­mann’s agree­ment the­o­rem seems to ap­ply se­man­ti­cally, but in fact doesn’t ap­ply math­e­mat­i­cally. Would there be in­ter­est in a top-level post about how Au­mann’s agree­ment the­o­rem can be used in real life, cen­ter­ing mostly around learn­ing from dis­agree­ments rather than forc­ing agree­ments?

• I’d be in­ter­ested, but I’ll prob­a­bly dis­agree. I don’t think Au­mann’s agree­ment the­o­rem can ever be used in real life. There are sev­eral rea­sons, but the sim­plest is that it re­quires the peo­ple in­volved share the same par­ti­tion func­tion over pos­si­ble wor­lds. If I re­call cor­rectly, this means that they have the same func­tion de­scribing how differ­ent ob­ser­va­tions would re­strict the pos­si­ble wor­lds they are in. This means that the proof as­sumes that these two ra­tio­nal agents would agree on the im­pli­ca­tions of any shared ob­ser­va­tion—which is al­most equiv­a­lent to what it is try­ing to prove!

• I will in­clude this in the post, if and when I can pro­duce one I think is up to scratch.

• What if you rep­re­sented those dis­agree­ments over im­pli­ca­tions as com­ing from agents hav­ing differ­ent log­i­cal in­for­ma­tion?

• I don’t re­ally see what is the prob­lem with Au­mann’s in that situ­a­tion. If X com­mits suicide and Y watches, are there any fac­tors (like P(MWI), or P(X dies|MWI)) that X and Y nec­es­sar­ily dis­agree on (or them agree­ing would be com­pletely un­re­al­is­tic)?

• If joe tries and fails to com­mit suicide, joe will have the propo­si­tion (in SNAc­tor-like syn­tax)

action(agent(me), act(suicide))
survives(me, suicide)

while jack will have the propositions

action(agent(joe), act(suicide))
survives(joe, suicide)

They both have a rule some­thing like

MWI ⇒ for every X, act(X) ⇒ P(survives(me, X)) = 1

but only joe can ap­ply this rule. For jack, the rule doesn’t match the data. This means that joe and jack have differ­ent par­ti­tion func­tions re­gard­ing the ex­ten­sional ob­ser­va­tion sur­vives(joe, X), which joe rep­re­sents as sur­vives(me, X).

If joe and jack both use an ex­ten­sional rep­re­sen­ta­tion, as the the­o­rem would re­quire, then nei­ther joe nor jack can un­der­stand quan­tum im­mor­tal­ity.
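A toy sketch of the indexical-matching point (the dict encoding is my stand-in for the SNActor-like syntax above): the QI rule is written in terms of “me”, so it pattern-matches joe’s own record of the event but not jack’s record about joe:

```python
# The QI rule fires only on propositions whose agent slot is the
# indexical "me"; an extensional record about "joe" doesn't match.
def qi_rule_applies(proposition):
    # rule: MWI => P(survives(me, X)) = 1, for any act X by "me"
    return proposition.get("agent") == "me"

joe_record = {"agent": "me", "act": "suicide"}     # joe's own encoding
jack_record = {"agent": "joe", "act": "suicide"}   # jack's encoding of joe

print(qi_rule_applies(joe_record))   # True: joe can apply the rule
print(qi_rule_applies(jack_record))  # False: for jack it never matches
```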

• So you’re saying that the knowledge “I survive X with probability 1” can in no way be translated into an objective rule without losing some information?

I as­sume the rules speak about sub­jec­tive ex­pe­rience, not about “some Everett branch ex­ist­ing” (so if I flip a coin, P(I ob­serve heads) = 0.5, not 1). (What do prob­a­bil­ities of pos­si­ble, mu­tu­ally ex­clu­sive out­comes of given ac­tion sum to in your sys­tem?)

Isn’t the translation a matter of applying conditional probability? I.e. P(survives(me, X)) = 1 ⇔ P(survives(joe, X) | joe’s experience continues) = 1

• I was actually going off the idea that the vast majority of worlds (100% minus Pr(survive all suicides)) would have the subject dead at some point, so all those worlds would not be convinced. Sure, people in your branch might believe you, but in (100 − 9.3×10^-302) percent of the branches, you aren’t there to prove that quantum suicide works. This means, I think, that the chance of you existing to prove to the rest of the world that quantum suicide proves MWI is equal to the chance of you surviving in a non-MWI universe.

I was going to say: well, if you had a test with a 1% chance of confirming X and a 99% chance of disconfirming X, and you ran it a thousand times and made sure you presented only the confirmations, you would be laughed at for suggesting that X is confirmed. But it is MWI that predicts that every quantum event comes out every way, so only under MWI could you run the test a thousand times; so that would indeed be pretty convincing evidence that MWI is true.

Also: I only have a pass­ing fa­mil­iar­ity with Robin’s man­gled wor­lds, but at the power of nega­tive three hun­dred, it feels like a small enough ‘world’ to get ab­sorbed into the mass of wor­lds where it works a few times and then they ac­tu­ally do die.

• Sure, peo­ple in your branch might be­lieve you

The prob­lem I have with that is that from my per­spec­tive as an ex­ter­nal ob­server it looks no differ­ent than some­one flip­ping a coin (ap­pro­pri­ately weighted) a thou­sand times and get­ting thou­sand heads. It’s quite im­prob­a­ble, but the fact that some­one’s life de­pends on the coin shouldn’t make any differ­ence for me—the uni­verse doesn’t care.

Of course it also doesn’t con­vince me that the coin will fall heads for the 1001-st time.

(That’s only if I con­sider MWI and Copen­hagen here. In re­al­ity af­ter 1000 coin flips/​suicides I would start to strongly sus­pect some al­ter­na­tive hy­pothe­ses. But even then it shouldn’t change my con­fi­dence of MWI rel­a­tive to my con­fi­dence of Copen­hagen).

• If the chance of sur­vival is p, then in both cases P(I sur­vive) = p, P(I sur­vive | I’m there to ob­serve it) = 1.

In­deed, the an­thropic prin­ci­ple ex­plains the re­sult of quan­tum suicide, whether or not you sub­scribe to the MWI. The real ques­tion is whether you ought to com­mit quan­tum suicide (and har­ness its an­thropic su­per­pow­ers for good). It’s a ques­tion of moral­ity.

• I would say quan­tum suicid­ing is not “har­ness­ing its an­thropic su­per­pow­ers for good”, it’s just con­ve­niently ex­clud­ing your­self from the branches where your su­per­pow­ers don’t work. So it has no more pos­i­tive im­pact on the uni­verse than you dy­ing has.

• I think you are cor­rect.

• Re­lated (some­what): The Hero With A Thou­sand Chances.

• If it’s not observable, then what difference does it make?

• It makes no difference. It’s a thought experiment about the consequences of MWI, but it isn’t a testable prediction.

• Of course it is testable. Just do some 30 quantum coin flips in a row. If any of them is a head, knock yourself into a deep sleep (with anesthesia) for 24 hours.

If you are still awake 1 hour after the coin has failed for the last time, QI is probably a fact.

• Nah. QI re­lies on your “sub­jec­tive thread” com­ing to an end in some wor­lds and con­tin­u­ing in oth­ers. In your ex­per­i­ment I’d be pretty cer­tain to get knocked out and wake up af­ter 24 hours.

• How does the Multiverse know I am just sleeping for 24 (or 24,000) hours? How does the Multiverse know I won’t be rescued after the real suicide attempt, after a quantum coin head popped up?

Or resurrected by some ultratech?

Where is the fine red line such that Quantum Immortality is possible, but the Quantum Awakening described above isn’t?

• How does the Mul­ti­verse know

It doesn’t, not right now in the present moment. But there’s no reason why “subjective threads” and “subjective probabilities” should depend on physical laws only locally. Imagine you’re an algorithm running on a computer. If someone pauses the computer for a thousand years, afterwards you go on running like nothing happened, even though at the moment of pausing nobody “knew” when/if you’d be restarted again.

• If someone pauses the computer for a thousand years, afterwards you go on running like nothing happened, even though at the moment of pausing nobody “knew” when/if you’d be restarted again.

But what if a new computer arises every time and an instance of this algorithm starts there?

As it allegedly does in MW?

• How does the Multiverse know I am just sleeping for 24 (or 24000) hours? How does the Multiverse know I won’t be rescued after a real suicide attempt, after a quantum coin comes up heads?

Because you won’t be back. The universe has the whole of eternity to just wait for you to come back. If you don’t, the only remaining ones that keep on experiencing from where you left off are the branches where the coin didn’t come up heads.

• I see. The MW has a book of those who will wake up and those who will not?

And acts accordingly. Splits or not.

I do not buy this, of course.

• It’s a good thought to reject.

In fact, quantum immortality has little to do with the actual properties of the universe, as long as it’s probabilistic. It’s just what happens when you arbitrarily (well, anthropically) decide to stop counting certain possibilities.

• No, it always splits into two Everett branches. It’s just that if you do in fact wake up in the distant future, that version of you that wakes up will be a successor of the you that is awake now, as is the version of you that never went to sleep in the next microsecond (or whatever). And you should anticipate either’s experiences equally.

Or at least that’s how I think it works (this assumes timeless physics, which I think is what Jonii assumed).

• There are two problems with this test.

First, the result of a coin flip is almost certainly determined by starting conditions. With enough knowledge of those conditions you could predict the result. Instead you should make a measurement on a quantum system, such as measuring the spin of an electron.

Second, the result of this test does not distinguish between QI and not-QI. The probability of being knocked out or left awake is the same in both cases.

I suppose you could be assuming that your consciousness can jump arbitrarily between universes to follow a conscious version of you… but no, that would just be silly.

• Instead you should make a measurement on a quantum system, such as measuring the spin of an electron.

This is probably what Thomas meant by “quantum” coin flip.

• You are right, I missed that. I probably shouldn’t post comments when I’m hungry; I’ve got a few other comments like this to account for as well. :)

• You might have missed the part where Thomas made it a “quantum coin flip”. The problem with the test is that by definition it can’t be replicated successfully by the scientific community, and that even if QI is true you will get disconfirming evidence in most Everett branches.

• I don’t postulate anything that is not already postulated in the so-called Quantum Suicide thought experiment.

I just apply this to the sleeping/coma case. It should work the same.

But I don’t think it works in either case.

• I don’t postulate anything that is not already postulated in the so-called Quantum Suicide thought experiment.

The test you proposed does not distinguish between QI and not-QI. I don’t think that the current formulation of MWI even allows this to be tested.

I just apply this to the sleeping/coma case. It should work the same.

Not a factor in my argument; both are untestable. You are arguing this point against others, not me.

• First, the result of a coin flip is almost certainly determined by starting conditions. With enough knowledge of those conditions you could predict the result.

If that’s a valid objection, then quantum suicide won’t work either. In fact, if that’s a valid objection, then many-worlds is impossible, since everything is deterministic with no possible alternatives.

• Many-worlds is a deterministic theory, as it says that the split configurations both occur.

Quantum immortality, mind you, is a very silly idea for a variety of other reasons, foremost of which is that a googolplex of universes still doesn’t ensure that there exists one of them in which a recognizable “you” survives next week, let alone to the end of time.

• I just added to the post.