Pure Bayesianism vs Newcomb’s Paradox

In this article, I analyze Newcomb's paradox with pure Bayesianism. The focus will be on the comparison of the counterfactual expected gains of the 1-boxer and 2-boxer strategies, using only the laws of probability. The main trick of the analysis is a clear separation between agents' epistemic thinking (assumed to be pure Bayesianism) and their decision-making processes.

I prove that engaging in counterfactual optimization implies 2-boxing. Weirdly enough, though, I also show that the player can believe that 1-boxing yields a strictly larger conditional expected gain, if the player thinks that the predictor Omega knows sufficiently more about the player's decision-making algorithm than the player does themself. In such a case, counterfactual optimization would favor 1-boxing. However, even in this case, as we will explain in greater detail, this does not mean that the player should then engage in counterfactual reasoning.

I will then make a few remarks about non-Bayesians, before concluding.

A slightly generalized Newcomb's paradox

Newcomb's paradox is a classical paradox of decision theory, which has been widely discussed by philosophers, as well as here on LessWrong. Evidently, I have not dug through the entire (massive!) literature on the problem, so I apologize if the analysis below has been done elsewhere.

Let's start by describing the problem. Two boxes, A and B, are presented to Alice.

  • Box A is opaque. Its content is unknown to Alice. But Alice knows that an algorithm called Omega has already decided what amount to put in it, according to principles that we shall detail later on. The content of Box A can no longer be changed.

  • Box B is transparent and contains $1,000.

Alice must choose between the 1-boxer and the 2-boxer strategies.

  • The 1-boxer only takes Box A.

  • The 2-boxer takes both Box A and Box B.

What makes Newcomb's paradox interesting is the way Omega decides what to put in Box A. Essentially, Omega will put a large amount of money in Box A if it predicts that the player will choose the 1-boxer strategy.

To make it realistic, here, we typically assume that Omega gathered a huge amount of data on Alice like, say, her entire Google, Facebook and Amazon data. Based on this and all sorts of other data, such as this online discussion about Newcomb's paradox, Omega made a probabilistic guess about which strategy Alice will choose. In this article, we'll assume that Omega is a pure Bayesian, in the sense that its probabilistic guess is derived from the laws of probability.

Denote $p$ the probability estimated by Omega that Alice will follow the 1-boxer strategy. Here, we assume that Omega flips a biased coin, so that, with probability $p$, it puts $1,000,000 in Box A. Otherwise, Omega leaves Box A empty.

(The classical version of Newcomb's paradox is the special case where the probability $p$ always equals either 0 or 1.)

Alice knows all of this. The two boxes are in front of her. She can either take Box A only (1-boxing), or take both Box A and Box B (2-boxing).
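The setup above can be sketched in a few lines of Python (the function name and interface are mine, for illustration only):

```python
import random

# A minimal sketch of one round of the generalized game, assuming we are
# handed Omega's probabilistic prediction p that Alice 1-boxes.
def play_round(p, alice_one_boxes, rng=random):
    box_a = 1_000_000 if rng.random() < p else 0  # Omega's biased coin
    box_b = 1_000                                 # the transparent box
    return box_a if alice_one_boxes else box_a + box_b

# In the classical, deterministic special case (p equals 0 or 1):
print(play_round(1.0, True))    # 1000000
print(play_round(0.0, False))   # 1000
```

Note that the interesting part of the paradox is precisely that `p` is not handed to Alice; it is Omega's private prediction.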

Imagine you're Alice. What do you do?


Newcomb as a counterfactual problem

Hopefully, you've already had endless debates about this famous paradox, including with people you disagree with. Essentially, the 1-boxer will say that their expected gain is larger than the 2-boxer's. The 2-boxer will respond that what matters is that they got $1,000 more than what they would have got, had they taken only one box.

It turns out that both the 1-boxer and the 2-boxer are engaging in counterfactual reasoning. In other words, they compare their expected gain as they follow their strategy to what they would have got in expectation by following the opposite strategy. In probabilistic terms, both compare $\mathbb{E}[G \mid 1]$ to $\mathbb{E}[G \mid 2]$, where $G$ is the player's gain ($G = A$ under 1-boxing, and $G = A + B$ under 2-boxing), $A$ is the content of Box A, $B$ is the content of Box B, 1 corresponds to the 1-boxer strategy, and 2 corresponds to the 2-boxer strategy.

Evidently, the crux of the problem is the uncertainty about the content of Box A, which affects the probability computation. The 1-boxer claims that $\mathbb{E}[A \mid 1] \gg \mathbb{E}[A \mid 2]$, which leads them to conclude that $\mathbb{E}[G \mid 1] > \mathbb{E}[G \mid 2]$. However, the 2-boxer argues that $\mathbb{E}[A \mid 1] = \mathbb{E}[A \mid 2]$, which leads to the conclusion $\mathbb{E}[G \mid 2] = \mathbb{E}[G \mid 1] + \$1{,}000 > \mathbb{E}[G \mid 1]$.
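The two computations can be made concrete with illustrative numbers (the near-perfect-predictor credences attributed to the 1-boxer below are my assumption):

```python
BIG, SMALL = 1_000_000, 1_000  # content of Box A (if filled) and of Box B

# The 1-boxer's credences: P[money in A | 1-box] ~ 1, P[money in A | 2-box] ~ 0.
e_g1 = 1.0 * BIG           # E[G|1]
e_g2 = 0.0 * BIG + SMALL   # E[G|2]
print(e_g1 > e_g2)         # True: these credences favor 1-boxing

# The 2-boxer's premise: E[A|1] = E[A|2] = E[A], whatever E[A] is.
for e_a in (0, 500_000, 1_000_000):
    # E[G|2] - E[G|1] = (E[A] + SMALL) - E[A] = SMALL, regardless of E[A]
    print((e_a + SMALL) - e_a)
```

Each side's conclusion follows from its own credences; the disagreement is entirely about $\mathbb{E}[A \mid 1]$ versus $\mathbb{E}[A \mid 2]$.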

So which is it? Who’s right?

Pure Bayesianism*

In causal decision theory, it is common to consider the expectations $\mathbb{E}[G \mid \mathrm{do}(1)]$ and $\mathbb{E}[G \mid \mathrm{do}(2)]$, where the do-operator consists of forcing the probability distribution over all variables that are not "caused" by events 1 and 2 to remain the same. Since the content of the box is usually considered not to be "caused" by events 1 and 2, it is common in causal decision theory to consider that $\mathbb{E}[A \mid \mathrm{do}(1)] = \mathbb{E}[A \mid \mathrm{do}(2)]$, in which case $\mathbb{E}[G \mid \mathrm{do}(2)] = \mathbb{E}[G \mid \mathrm{do}(1)] + \$1{,}000$. It is thus common in causal decision theory to argue that the 2-boxer strategy is the better one.

Annoyingly, causal decision theory does have "bugs". In particular, it is arguably severely "model-dependent", in the sense that our choice of causal model affects what the do-operator really means. Thus, two causal decision theorists may still disagree, because they choose to model causality differently — this "bug" applies to "causality" more generally. One way to fix this would be to consider a universal prior over all causal models. But then, one might wonder why we dogmatically chose to restrict ourselves to causal models only, thereby steering us away from the universal prior over all computable theories that Solomonoff created for us.

In any case, by adding the do-operator, causal decision theory is not purely Bayesian; it is adding an axiom to the laws of probability. But this is not specific to causal decision theory. More generally, the trouble is that pure Bayesianism does not model decision-making. Granted, it may seem "natural" to make decisions based on the maximization of expected future rewards, conditioned on all observed data. But it should be stressed that this is one way of making decisions, which is in no way derived from the laws of probability themselves. So, to be sure we're going for pure Bayesianism, let's be agnostic about decision-making processes.

This means in particular that we will clearly separate an individual's epistemic thinking from their decision-making process. The epistemic thinking computes how the individual views the world. It is the individual's best attempt to describe the world as it is, including objects in the world such as the individual themself. In this article, we will assume that both Alice and Omega rely on pure Bayesianism to do so.

What I mean by pure Bayesianism here can be Solomonoff's induction, which runs on a universal prior over all computable probability distributions over data. But I won't need the full artillery of Solomonoff. Let's just consider that any prediction is derived by applying solely the laws of probability, with some initial prior, to any observable data.

Note that this means that we are ignoring computational complexity constraints, at least for now. We will come back to this point later on. Arguably, what makes this problem interesting in practice is precisely the fact that Bayesian computations are intractable (if not uncomputable!). But for now, we will simply assume that both Alice and Omega have infinite computational power, which allows them to run any terminating algorithm instantly, and to immediately identify algorithms that never halt as such.

Now, Omega's decision-making process will exploit its epistemic thinking, as described above. Namely, it will decide to put $1M in Box A with probability equal to its prediction that Alice will choose the 1-boxer strategy.

However, we will not assume anything about Alice's decision-making process. In fact, Alice's actual decision-making algorithm may or may not exploit her epistemic computations. Typically, if Alice pre-commits to 1-boxing, then her decision-making algorithm will not be exploiting her epistemic thinking.

The zero-surprise theorem

Clearly, if Alice's epistemic system perfectly knows her decision-making algorithm, then this epistemic system can predict the decisions she will make. Indeed, it can simply run the decision-making algorithm to derive a deterministic prediction of Alice's decision. But this has a weird consequence: given all of her past data, Alice essentially cannot be surprised by her own decisions!

The reason why this remark is critical is because of an easy Bayesian theorem, which I'll call the zero-surprise theorem.

Theorem (zero surprise). If a new piece of data $D$ is unsurprising to a Bayesian, then this Bayesian does not change her beliefs by learning $D$. More precisely, for any theory $T$, if $\mathbb{P}[D] = 1$, then $\mathbb{P}[T \mid D] = \mathbb{P}[T]$.
Proof. This follows straightforwardly from Bayes rule, after noting that, for $\mathbb{P}[T] > 0$, $\mathbb{P}[D] = 1$ implies $\mathbb{P}[D \mid T] = 1$, and thus $\mathbb{P}[T \mid D] = \mathbb{P}[D \mid T] \, \mathbb{P}[T] / \mathbb{P}[D] = \mathbb{P}[T]$.

Thus, if Alice's decision-making fully follows from her epistemic beliefs, then she would believe $\mathbb{P}[d] = 1$, for either $d = 1$ (1-boxer) or $d = 2$ (2-boxer). In both cases, the fact that she decides $d$ cannot surprise her. And thus, $\mathbb{E}[A \mid d] = \mathbb{E}[A]$.

Now the 1-boxers out there may argue that Alice could still use randomized decision-making algorithms, for instance by tossing a coin at some point (or by drawing random bits) to make her decision. Alas, a more general version of the zero-surprise theorem applies.

Theorem (equal surprise). If the likelihood of data $D$ is the same under all theories, i.e. $\mathbb{P}[D \mid T_1] = \mathbb{P}[D \mid T_2]$ for any pair of theories $T_1$ and $T_2$, and assuming that the theories partition the probability space, then the posterior equals the prior, i.e. $\mathbb{P}[T \mid D] = \mathbb{P}[T]$ for every theory $T$.
Proof. This also follows from Bayes rule, by expanding the denominator $\mathbb{P}[D] = \sum_i \mathbb{P}[D \mid T_i] \, \mathbb{P}[T_i]$ using the law of total probability and canceling the likelihood terms.

As a result, assuming that the random bits of Alice's decision-making algorithm are also unknown to Omega (when Omega estimated $p$), and given that the probability of these bits is the same whether Box A contains money or not, Alice must conclude that $\mathbb{E}[A \mid 1] = \mathbb{E}[A \mid 2]$, even if she goes for a randomized decision. In this case, no matter what decision-making algorithm Alice opts for, Alice must conclude that the 2-boxer strategy yields counterfactually larger expected rewards. Counterfactual reasoning based on a transparent decision-making algorithm favors 2-boxing.

Theorem (transparent counterfactual optimization). If Alice's decision-making algorithm uses counterfactual optimization, and if Alice's epistemic thinking knows that this is how decisions are made, then Alice will be a 2-boxer.

In fact, it can be shown that a sufficient condition for this is Alice's ability to deterministically predict Omega's prediction $p$ of Alice's decision-making algorithm. Indeed, Alice's decision will then not change Alice's credence about Omega's prediction $p$, and will thus not change her credence about the content of Box A.

Opaque decision-making algorithm

Critically, the above conclusion allows that, in a sense, Alice's decision-making algorithm is not fully known to Alice's epistemic system. We got away with it by assuming that what Alice does not know about her algorithm is unknown to Omega too. But what if Omega had access to bits of Alice's decision-making algorithm that Alice's epistemic system does not quite know? More interestingly still, what if Alice knows that Omega had better knowledge of her own decision-making algorithm when Omega made its prediction and decided the content of Box A?

This may sound outlandish. But it may not be so unreasonable. After all, our own decision-making algorithms are the result of numerous electrical impulses in our brains' complex neural networks. Clearly, they are opaque to us. In fact, it may be particularly hard for us to predict how our decision-making algorithms will behave, especially in unusual settings — such as Newcomb's paradox. And even if we were given an oracle to compute Bayes rule on all our observed data, this oracle would likely still only return a probability distribution over the set of algorithms that we apply to make decisions, since it does not have precise data about the topology of our brains.

Meanwhile, today's machine learning algorithms already collect huge amounts of data about a lot of different humans. By leveraging data about how some humans behave in unusual settings, it seems possible for them to infer how our decision-making algorithms will behave in such settings better than we can infer it ourselves. After all, if we know what challenges our friends will face, and how people similar to our friends behave in these challenges, we can sometimes correctly bet that our friends will do well, even when they are more hesitant.

Well, the opacity of Alice's decision-making algorithm to her epistemic system changes everything. Things now become very interesting! Intuitively, the fact that Alice now concludes that she will be a 1-boxer reveals to her features of her decision-making algorithm, which will thus modify her beliefs about what Omega put in the box. And thus, it is now possible that she actually concludes that $\mathbb{E}[G \mid 1] > \mathbb{E}[G \mid 2]$. In fact, we will provide an example where this is the case.

But before getting to our example, for the 2-boxers out there, it's worth detailing what this means intuitively. Of course, the content of Box A will not change because Alice chooses to opt for the 1-boxer strategy. What does change, however, is Alice's credence about what Box A contains. The world did not move. But Alice's model of the world did. And this change of belief modifies her computation of her counterfactual expected gains.

More knowledgeable Omega

To present a simple example where $\mathbb{E}[G \mid 1] > \mathbb{E}[G \mid 2]$, we will assume that Omega has access to all of Alice's data, in addition to some further data that may inform Omega about Alice's decision-making system. This will simplify computations, as it allows us to apply the Bayesian theorem of the argument of authority.

Theorem (argument of authority). If Alice and Omega are honest Bayesians with the same fundamental prior, if Omega knows strictly more data than Alice, and if Omega tells Alice about a prediction it made, then Alice must now adopt Omega's prediction. More precisely, denoting $D$ Alice's data and $D' \supseteq D$ Omega's data, for any event $E$ and any value $q$, we have $\mathbb{P}[E \mid D, \mathbb{P}[E \mid D'] = q] = q$.
Proof. Alice has a prior on Omega's data $D'$. The event $\mathbb{P}[E \mid D'] = q$ rules out all data $D'$ for which $\mathbb{P}[E \mid D'] \neq q$. Thus all data $D'$ that still have nonzero probability given this event must satisfy $\mathbb{P}[E \mid D'] = q$. Decomposing the computation of $\mathbb{P}[E \mid D, \mathbb{P}[E \mid D'] = q]$ using the law of total probability over all possible values of $D'$ given this event then yields an average over values that all equal $q$. This average must thus also equal $q$.
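The theorem can be checked empirically on a toy example of my own devising: a coin of unknown bias, where Omega privately observes one extra flip before announcing its prediction for the next flip. Among the rounds where Omega announces prediction $q$, the predicted event should occur with frequency $q$, which is exactly what Alice must then believe.

```python
import random

random.seed(1)
counts = {}  # Omega's announced prediction q -> (occurrences, rounds)
for _ in range(200_000):
    theta = random.choice([0.3, 0.7])  # unknown bias, uniform prior
    x1 = random.random() < theta       # Omega's private extra data point
    # Omega's Bayesian posterior on theta = 0.7, then its prediction
    post = 0.7 if x1 else 0.3          # P[theta = 0.7 | x1]
    q = round(post * 0.7 + (1 - post) * 0.3, 2)  # P[next flip heads | x1]
    x2 = random.random() < theta       # the flip that Omega predicted
    occ, tot = counts.get(q, (0, 0))
    counts[q] = (occ + x2, tot + 1)

for q, (occ, tot) in sorted(counts.items()):
    print(q, round(occ / tot, 3))  # empirical frequency is close to q
```

Here Omega only ever announces $q = 0.42$ or $q = 0.58$, and the empirical frequencies match: a Bayesian Alice who hears Omega's prediction has no choice but to make it hers.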

An interesting corollary of the theorem is that it is actually now sufficient to express all the uncertainties of the problem in terms of Omega's prediction $p$, rather than in terms of the uncertainty about Alice's decision-making algorithm. In particular, Alice can now easily derive her belief about whether her decision-making algorithm will make her choose the 1-boxer strategy.

Lemma 1. Alice's prior on the fact that she will be a 1-boxer equals her expected prior on Omega's prediction. In other words, denoting $p$ Omega's prediction, we have $\mathbb{P}[1] = \mathbb{E}[p]$.
Proof. To obtain this, we first apply the law of total probability to note that $\mathbb{P}[1] = \mathbb{E}[\mathbb{P}[1 \mid p]]$. Now, Omega knows strictly more than Alice; thus, if Omega predicts that Alice will opt for the 1-boxer strategy with probability $p$, then Alice must now believe so. By the Bayesian theorem of the argument of authority, $\mathbb{P}[1 \mid p] = p$. As a result, we have $\mathbb{P}[1] = \mathbb{E}[p]$.

Interestingly, we have a similar result for the prior probability that Box A contains money.

Lemma 2. Denote $M$ the event where Box A contains $1M. Then $\mathbb{P}[M] = \mathbb{E}[p]$. In other words, from Alice's viewpoint, the probability that Box A contains $1M is Alice's expected prior on Omega's prediction.
Proof. By the law of total probability, $\mathbb{P}[M] = \mathbb{E}[\mathbb{P}[M \mid p]]$. Now, as we saw, Omega's decision-making process is such that Omega puts $1M in Box A with probability $p$, i.e. $\mathbb{P}[M \mid p] = p$. As a result, we have $\mathbb{P}[M] = \mathbb{E}[p]$.

We move on to the computation of the posterior probability that Box A contains $1M.

Lemma 3. $\mathbb{P}[M \mid 1] = \mathbb{E}[p] + \mathrm{Var}[p] / \mathbb{E}[p]$, where $\mathrm{Var}[p]$ is the prior variance of Omega's prediction.
Proof. Bayes rule yields $\mathbb{P}[M \mid 1] = \mathbb{P}[1 \mid M] \, \mathbb{P}[M] / \mathbb{P}[1]$. Using $\mathbb{P}[M] = \mathbb{P}[1] = \mathbb{E}[p]$, as we saw in the two previous lemmas, we now have $\mathbb{P}[M \mid 1] = \mathbb{P}[1 \mid M]$.
Now, to compute $\mathbb{P}[1 \mid M]$, we use again the law of total probability, by conditioning over the different values of $p$. This yields $\mathbb{P}[1 \mid M] = \int \mathbb{P}[1 \mid p, M] \, \mathbb{P}[p \mid M] \, dp$. Now, the fact that Omega's coin toss is independent of the choice of the individual, conditioned on Omega's estimation of $p$, means that $\mathbb{P}[1 \mid p, M] = \mathbb{P}[1 \mid p]$, which, as we saw, is equal to $p$ by virtue of the Bayesian theorem of the argument of authority. We thus have $\mathbb{P}[1 \mid M] = \int p \, \mathbb{P}[p \mid M] \, dp$. For the other quantity $\mathbb{P}[p \mid M]$, we use Bayes rule, which yields $\mathbb{P}[p \mid M] = \mathbb{P}[M \mid p] \, \mathbb{P}[p] / \mathbb{P}[M]$. We then note that, by definition of Omega's choice for the content of Box A, we have $\mathbb{P}[M \mid p] = p$. As we already saw, the denominator is $\mathbb{P}[M] = \mathbb{E}[p]$. Overall, we obtain $\mathbb{P}[1 \mid M] = \int p^2 \, \mathbb{P}[p] \, dp \,/\, \mathbb{E}[p] = \mathbb{E}[p^2] / \mathbb{E}[p]$.
Using $\mathbb{E}[p^2] = \mathbb{E}[p]^2 + \mathrm{Var}[p]$ allows us to conclude.

In other words, especially if Alice has a large uncertainty about Omega's prediction, while assuming that Omega is likely predicting that she will be a 2-boxer, Alice will increase her credence that Box A contains $1M if her decision-making makes her choose Box A only.

An immediate corollary is that, conversely, if the Bayesian opts for the 2-boxer strategy, then her credence in Box A containing $1M decreases. Indeed, we have the following Bayesian no-confirmation-bias theorem.

Theorem (no confirmation bias). A Bayesian's prior must be equal to the expectation of her posterior. She cannot expect to increase her beliefs by looking at more data. More precisely, $\mathbb{P}[T] = \mathbb{E}_D[\mathbb{P}[T \mid D]]$.
Proof. This is exactly the law of total probability.

Thus, $\mathbb{P}[M]$ must be an average of $\mathbb{P}[M \mid 1]$ and $\mathbb{P}[M \mid 2]$. But in fact, we can compute the last term explicitly.

Lemma 4. $\mathbb{P}[M \mid 2] = \mathbb{E}[p] - \mathrm{Var}[p] / (1 - \mathbb{E}[p])$.
Proof. By Bayes rule, $\mathbb{P}[M \mid 2] = \mathbb{P}[2 \mid M] \, \mathbb{P}[M] / \mathbb{P}[2]$. But we have $\mathbb{P}[2 \mid M] = 1 - \mathbb{P}[1 \mid M] = 1 - \mathbb{E}[p^2]/\mathbb{E}[p]$, $\mathbb{P}[M] = \mathbb{E}[p]$, and $\mathbb{P}[2] = 1 - \mathbb{E}[p]$. Combining it all yields $\mathbb{P}[M \mid 2] = \frac{\mathbb{E}[p] - \mathbb{E}[p^2]}{1 - \mathbb{E}[p]} = \mathbb{E}[p] - \frac{\mathrm{Var}[p]}{1 - \mathbb{E}[p]}$.
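Lemmas 1 through 4 can be checked by simulation. By the argument of authority, conditioned on Omega's prediction $p$, Alice 1-boxes with probability $p$ and, independently, Omega fills Box A with probability $p$. Below, a uniform prior over $p$ (an illustrative assumption) gives $\mathbb{E}[p] = 1/2$ and $\mathbb{E}[p^2] = 1/3$, so the lemmas predict $\mathbb{P}[1] = \mathbb{P}[M] = 1/2$, $\mathbb{P}[M \mid 1] = 2/3$ and $\mathbb{P}[M \mid 2] = 1/3$.

```python
import random

random.seed(0)
N = 500_000
n1 = n2 = m1 = m2 = 0
for _ in range(N):
    p = random.random()             # p drawn from the uniform prior
    one_box = random.random() < p   # Alice 1-boxes with probability p
    money = random.random() < p     # Omega fills Box A with probability p
    if one_box:
        n1 += 1
        m1 += money
    else:
        n2 += 1
        m2 += money

print(n1 / N)         # P[1]   ~ E[p]                   = 0.5  (Lemma 1)
print((m1 + m2) / N)  # P[M]   ~ E[p]                   = 0.5  (Lemma 2)
print(m1 / n1)        # P[M|1] ~ E[p^2]/E[p]            = 2/3  (Lemma 3)
print(m2 / n2)        # P[M|2] ~ (E[p]-E[p^2])/(1-E[p]) = 1/3  (Lemma 4)
```

The gap between the last two frequencies is precisely what Alice learns about Box A from observing her own decision.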

Does this mean that we should opt for the 1-boxer strategy? Well, it depends on the details of the computation. Let us denote $\mu = \mathbb{E}[p]$ and $\sigma^2 = \mathrm{Var}[p]$. We then have $\mathbb{P}[M \mid 1] - \mathbb{P}[M \mid 2] = \frac{\sigma^2}{\mu} + \frac{\sigma^2}{1 - \mu} = \frac{\sigma^2}{\mu(1-\mu)}$.

Main Theorem. Counterfactual optimization favors 1-boxing if, and only if, we have $\sigma^2 / (\mu(1-\mu)) > \$1{,}000 / \$1{,}000{,}000 = 10^{-3}$.
Proof. Lemma 3 implies $\mathbb{E}[G \mid 1] = \$1M \cdot (\mu + \sigma^2/\mu)$. Meanwhile, Lemma 4 implies $\mathbb{E}[G \mid 2] = \$1M \cdot (\mu - \sigma^2/(1-\mu)) + \$1{,}000$. We then have $\mathbb{E}[G \mid 1] - \mathbb{E}[G \mid 2] = \$1M \cdot \frac{\sigma^2}{\mu(1-\mu)} - \$1{,}000$. Therefore, we observe that $\mathbb{E}[G \mid 1] > \mathbb{E}[G \mid 2]$ if, and only if, $\sigma^2 / (\mu(1-\mu)) > 10^{-3}$.
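For a concrete instance of the theorem, suppose (my assumption, purely for illustration) that Alice's prior on Omega's prediction is a Beta(a, b) distribution. Then $\mu = a/(a+b)$ and $\sigma^2 = ab/((a+b)^2(a+b+1))$, so $\sigma^2/(\mu(1-\mu)) = 1/(a+b+1)$: counterfactual optimization favors 1-boxing exactly when $a + b + 1 < 1000$, i.e. when the prior is diffuse enough.

```python
# Counterfactual gain gap E[G|1] - E[G|2] under a Beta(a, b) prior on p
# (the Beta prior is an illustrative assumption, not part of the paradox).
def gain_gap(a, b, big=1_000_000, small=1_000):
    mu = a / (a + b)                            # E[p]
    var = a * b / ((a + b) ** 2 * (a + b + 1))  # Var[p]
    return big * var / (mu * (1 - mu)) - small  # Main Theorem's gap

print(gain_gap(1, 1))      # uniform prior: large positive, favors 1-boxing
print(gain_gap(400, 600))  # confident prior (a+b+1 > 1000): favors 2-boxing
```

Note that under a Beta prior the threshold does not depend on $\mu$ at all, only on the total pseudo-count $a + b$, i.e. on how confident Alice is about Omega's prediction.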

Therefore, the expected gain is larger for the 1-boxer if the possible large content of Box A is much larger than that of Box B, if the uncertainty on Omega's prediction is large, and if Omega's prediction is likely to be nearly deterministic (i.e. if $\mu(1-\mu)$ is small). Arguably, this result captures some of the intuition we may have about this problem, as the 1-boxer strategy may feel more compelling if the potential content of Box A vastly exceeds that of Box B, and if we trust Omega to know much more about our own decision-making process than we do.

Shouldn't Alice engage in counterfactual optimization?

Now, given all of this, assuming $\sigma^2 / (\mu(1-\mu)) > 10^{-3}$, should we conclude that Alice must then choose to be a 1-boxer?

Well, technically, all we said is that if Alice follows her decision-making algorithm, if Omega knows strictly more than Alice, in particular about her decision-making algorithm, and if $\sigma^2 / (\mu(1-\mu)) > 10^{-3}$, then counterfactual reasoning favors 1-boxing. But weirdly enough, this does not mean that Alice should decide to 1-box. Indeed, if Alice chooses her decision-making algorithm, then this decision-making algorithm will now be transparent to her, and the theorem no longer applies.

In fact, if Alice engages in counterfactual optimization, then she will know her decision-making algorithm, and thus, by virtue of the theorem of transparent counterfactual optimization we proved above, she will be using the 2-boxer strategy. Put differently, this theorem says that, if you have a purely Bayesian epistemic system, then you can't engage in counterfactual optimization and decide to be a 1-boxer.

But say we're now in the context of the previous section, where Alice's decision-making algorithm is opaque to her. And assume that $\sigma^2 / (\mu(1-\mu)) > 10^{-3}$. Then, we can argue that Alice should be a 1-boxer, based on counterfactual optimization. In fact, Alice's own epistemic system may be shouting that she should be 1-boxing. But this epistemic system is not Alice's decision-making system. And this decision-making algorithm cannot be changed.

But isn't that just a wrong assumption? Can't we say that Alice can decide which decision-making algorithm to execute, perhaps even up to some noise?

Well, if so, then Alice's decision-making algorithm would actually be none other than the meta-algorithm that outputs a decision-making algorithm to be executed, and then executes it. Yet a meta-algorithm is still an algorithm! And it is this meta-algorithm that may be better understood by Omega than by Alice herself.

This may be frustrating. You might want to ask: can't we at least say that the 1-boxer strategy wins more money than the 2-boxer strategy?

Yes, definitely, because these are descriptive questions. And indeed, if both Alice and Omega know that Alice's decision-making will tell her to use the 1-boxer strategy, then Alice will know she will gain $1M. Conversely, if both know that Alice will be a 2-boxer, then Alice will know she will gain $1,000. In the former case, Alice will win more than in the latter case. But, as we saw earlier, because of the zero-surprise theorem, in both cases, $\mathbb{E}[A \mid 1] = \mathbb{E}[A \mid 2] = \mathbb{E}[A]$. Thus, in both cases, Alice's epistemic system expected to win $1,000 more by taking both boxes. In both cases, her counterfactual optimization urged her to be a 2-boxer.

A few remarks for non-Bayesians

Now, unfortunately, those of us who are mere mortals usually don't have access to a Bayesian oracle. Given our limited computational resources, how should we draw inspiration from this Bayesian analysis of Newcomb's paradox to reach conclusions about what we should actually do in Newcomb-like paradoxes?

The critical feature of our analysis arguably still stands. In a sense, our decision-making algorithms are partial black boxes to ourselves. We are running an algorithm whose code is unknown to us. As a result, we should expect to be surprised by some decisions we make, and thus to learn something once we find out what our decision-making systems output.

Interestingly, if we now consider computational complexity, this principle becomes all the more compelling. In practice, because we have limited computing power, especially under the assumption of computational irreducibility, we should often be surprised by the results of our decision-making process. This principle is beautifully captured in this quote from Turing's brilliant 1950 paper.

The view that machines cannot give rise to surprises is due, I believe, to a fallacy to which philosophers and mathematicians are particularly subject. This is the assumption that as soon as a fact is presented to a mind all consequences of that fact spring into the mind simultaneously with it. It is a very useful assumption under many circumstances, but one too easily forgets that it is false. A natural consequence of doing so is that one then assumes that there is no virtue in the mere working out of consequences from data and general principles.

To rephrase this, even when you know how you decide, you may still not know what you decide. You should then not be surprised that some algorithm like Omega really can predict what you decide better than you can.

In this case, if you somehow think that Omega does predict better than you can, and if you are really unsure about what Omega will predict for you, then approximate counterfactual reasoning may suggest that you might want to engage in 1-boxing, though this setting with limited computational resources would need to be investigated further to clear things up.

Weirdly, though, this suggests that the more you think about Newcomb's paradox, the more (probably) you decrease your uncertainty about Omega's prediction — assuming that it's reasonably close to a Bayesian prediction. And thus, the more compelling the 2-boxer strategy may seem to you.


In this article, we carefully distinguished an individual's decision-making algorithm from their epistemic system. Interestingly, this principle has been argued to be critical to robust AI alignment. It is probably useful for us humans as well. The job of deciding what to do is not the same as the job of figuring out a reliable model of the world. Interestingly, we commonly assume that the decision-making algorithm should exploit the epistemic system. But a perhaps neglected reflection is the fact that the epistemic system must also describe the decision-making algorithm. In practice, it is usually unfortunate that our decision-making algorithms are somewhat opaque to ourselves — though, in the specific case of Newcomb's paradox, it may be just what we need to counterfactually prefer 1-boxing.

Note though that I'm leaving open the question of whether we should aim at counterfactual optimization as a decision rule. I personally still believe that it is the most natural form of decision rule, especially under a Bayesian framework when considering von Neumann–Morgenstern preferences. I gladly acknowledge, however, my lack of understanding of the alternatives.

An important feature of this article is a strong motivation to consistently apply Bayesianism to epistemic thinking. This motivation of mine is the result of years of wonder within the formidable world of probability theory and its astonishing theorems, a few of which have been mentioned above. My fascination for Bayes rule has culminated in a book I recently published at CRC Press, called The Equation of Knowledge, which I presented in this LessWrong post. The book contains a large number of examples, many of which are pretty similar to what you have read in this article. Others are more empirical. If you found this article somewhat interesting, I bet that you'll really enjoy the book.

(my decision-making algorithm urged me to share truthfully my epistemic thinking about this!)