# Why Bayesians should two-box in a one-shot

Consider Newcomb’s problem.

Let ‘general’ be the claim that Omega is always right.

Let ‘instance’ be the claim that Omega is right about a particular prediction.

Assume you, the player, are not told the rules of the game until after Omega has made its prediction.

Consider two variants of Newcomb’s problem.

1. Omega is a perfect predictor. In this variant, you assign a prior of 1 to P(general). You are then obligated to believe that Omega has correctly predicted your action. In this case Eliezer’s conclusion is correct, and you should one-box. It’s still unclear whether you have free will, and hence any choice in what you do next, but you can’t lose by one-boxing.

But you can’t assign a prior of 1 to P(general), because you’re a Bayesian. You derive your prior for P(general) from the (finite) empirical data. Say you begin with a prior of 0.5 before considering any observations. Then you observe all of Omega’s N predictions, and each time, Omega gets it right, and you update:

P(general | instance) = P(instance | general) P(general) / P(instance)
= P(general) / P(instance), since P(instance | general) = 1

Omega would need to make an infinite number of correct predictions before you could assign a probability of 1 to P(general). So this case is theoretically impossible, and should not be considered.
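This can be made concrete with a toy two-hypothesis model (a sketch: the 0.5 starting prior comes from the text above, while the single rival hypothesis and its 0.9 accuracy are illustrative assumptions). However many correct predictions you observe, the posterior approaches 1 but never reaches it:

```python
from fractions import Fraction

def posterior_always_right(n_correct,
                           prior=Fraction(1, 2),
                           rival_accuracy=Fraction(9, 10)):
    """Posterior probability that Omega is *always* right, after observing
    n_correct flawless predictions, against a single rival hypothesis under
    which Omega is merely right with probability rival_accuracy."""
    like_general = Fraction(1)                # 'general' predicts the data with certainty
    like_rival = rival_accuracy ** n_correct  # rival: each success has probability 0.9
    evidence = prior * like_general + (1 - prior) * like_rival
    return prior * like_general / evidence

for n in (0, 10, 100, 1000):
    p = posterior_always_right(n)
    print(n, float(p), p == 1)  # the last column is always False
```

Since the rival likelihood is positive for every finite n, the denominator always exceeds the numerator, which is the "infinite number of correct predictions" point in code form.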

2. Omega is a “nearly perfect” predictor. You assign P(general) a value very, very close to 1. You must, however, do the math and try to compare the expected payoffs, at least in an order-of-magnitude way, and not just use verbal reasoning as if we were medieval scholastics.

The argument for two-boxing is that your action now can’t affect what Omega did in the past. That is, we are using a model which includes not just P(instance | general), but also the interaction of your action, the contents of the boxes, and the claim that Omega cannot violate causality: P( P($1M box is empty | you one-box) = P($1M box is empty | you two-box) ) >= P(Omega cannot violate causality), and that needs to be entered into the computation.

Numerically, two-boxers claim that the high probability they assign to our understanding of causality being basically correct more than cancels out the high probability of Omega being correct.
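To put numbers on what has to cancel out what, here is the bare expected-dollar comparison under the standard payoffs ($1,000,000 and $1,000), with p the probability that Omega’s prediction matches your action and no causality term included yet; this is the quantity the two-boxer’s near-1 confidence in causality must overwhelm:

```python
def expected_values(p, big=1_000_000, small=1_000):
    """Expected dollar payoffs when Omega's prediction matches your actual
    action with probability p (boxes filled according to the prediction)."""
    ev_one_box = p * big                # the $1M is there only if you were predicted correctly
    ev_two_box = small + (1 - p) * big  # $1k for sure, plus the $1M on a mispredict
    return ev_one_box, ev_two_box

# Break-even: p * big == small + (1 - p) * big  =>  p = (big + small) / (2 * big)
p_star = (1_000_000 + 1_000) / (2 * 1_000_000)
print(p_star)                # 0.5005
print(expected_values(0.9))  # one-boxing is ahead once p exceeds p_star
```

On this naive calculation one-boxing wins for any p above 0.5005; the two-boxer’s position is precisely that the model is wrong to let p do that work, because the boxes are already filled.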

The argument for one-boxing is that you aren’t entirely sure you understand physics, but you know Omega has a really good track record—so good that it is more likely that your understanding of physics is false than that you can falsify Omega’s prediction. This is a strict reliance on empirical observations as opposed to abstract reason: count up how often Omega has been right and compute a prior.

However, if we’re going to be strict empiricists, we should double down on that, and set our prior on P(cannot violate causality) strictly empirically—based on all observations regarding whether or not things in the present can affect things in the past.

This includes up to every particle interaction in our observable universe. The number is not so high as that, as probably a large number of interactions could occur in which the future affects the past without our noticing. But the number of observations any one person has made in which events in the future seem to have failed to affect events in the present is certainly very large, and the accumulated wisdom of the entire human race on the issue must provide more bits in favor of the hypothesis that causality can’t be violated than the bits for Omega’s infallibility based on the comparatively paltry number of observations of Omega’s predictions, unless Omega is very busy indeed. And even if Omega has somehow made enough observations, most of them are as inaccessible to you as observations of the laws of causality working on the dark side of the moon. You, personally, cannot have observed Omega make more correct predictions than the number of events you have observed in which the future failed to affect the present.

You could compute a new payoff matrix that made it rational to one-box, but the ratio between the payoffs would need to be many orders of magnitude higher. You’d have to compute it in utilons rather than dollars, because the utility of dollars doesn’t scale linearly. And that means you’d run into the problem that humans have some upper bound on utility—they aren’t cognitively complex enough to achieve utility levels 10^10 times greater than “won $1,000”. So it still might not be rational to one-box, because the utility payoff under the one box might need to be larger than you, as a human, could experience.
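As a back-of-the-envelope sketch of why the ratio must be so large (my own formalization of the post’s two-boxer model, not anything from the original problem statement): under that model, your choice reaches back to affect the box only in the tiny-probability worlds where causality can be violated, so the big prize must be worth roughly 1/P(backward causation) times the sure small prize, in utilons:

```python
def required_utility_ratio(p_backward):
    """Minimum ratio U(big)/U(small) for one-boxing to have higher expected
    utility than two-boxing, when the box's contents can respond to your
    choice only with probability p_backward (a hypothetically tiny number).
    One-box gain ~ p_backward * U(big); two-box gain ~ U(small)."""
    return 1 / p_backward

print(required_utility_ratio(1e-20))  # ~1e20: far past any plausible bound on human utility
```

If your empirical prior on backward causation is anywhere near that small, the required utility ratio exceeds what a bounded human mind could experience, which is the post’s point.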

## Pre-commitment

The case in which you get to think about what to do before Omega studies you and makes its decision is more complicated, because your probability calculation then also depends on what you think you would have done before Omega made its decision. This only affects the partition of your probability calculation in which Omega can alter the past, however, so numerically it doesn’t make a big difference.

The trick here is that most statements of Newcomb’s problem are ambiguous as to whether you are told the rules before Omega studies you, and as to which decision they’re asking you about when they ask if you one-box or two-box. Are they asking about what you pre-commit to, or what you eventually do? These decisions are separate, but not isolatable.

As long as we focus on the single decision at the point of action, the analysis above (modified as just mentioned) still follows. If we ask what the player should plan to do before Omega makes its decision, then the question is just whether you have a good enough poker face to fool Omega. Here it takes no causality violation for Omega to fill the boxes in accordance with your plans, so that factor does not enter in, and you should plan to one-box.

If you are a deterministic AI, that implies that you will one-box. If you’re a GOFAI built according to the old-fashioned symbolic-logic AI designs talked about on LW (which, BTW, don’t work), it implies you will probably one-box even if you’re not deterministic, as otherwise you would need to be inconsistent, which is not allowed with GOFAI architectures. If you’re a human, you’d theoretically be better off if you could suddenly see things differently when it’s time to choose boxes, but that’s not psychologically plausible. In no case is there a paradox, or any real difficulty to the decision to one-box.

## Iterated Games

Everything changes with iterated interactions. It’s useful to develop a reputation for one-boxing, because this may convince people that you will keep your word even when it seems disadvantageous to you. It’s useful to convince people that you would one-box, and it’s even beneficial, in certain respects, to spread the false belief in the Bayesian community that Bayesians should one-box.

Read Eliezer’s post carefully, and I think you’ll agree that the reasoning Eliezer gives for one-boxing is not that it is the rational solution to a one-off game—it’s that it’s a winning policy to be the kind of person who one-boxes. That’s not an argument that the payoff matrix of an instantaneous decision favors one-boxing; it’s an argument for a LessWrongian morality. It’s the same basic argument as that honoring commitments is a good long-term strategy. But the way Eliezer stated it has given many people the false impression that one-boxing is actually the rational choice in an instantaneous one-shot game (and that’s the only interpretation which would make it interesting).

The one-boxing argument is so appealing because it offers a solution to difficult coordination problems. It makes it appear that rational altruism and a rational utopia are within our reach.

But this is wishful thinking, not math, and I believe that the social norm of doing the math is even more important than a social norm of one-boxing.

• The argument for one-boxing is that you aren’t entirely sure you understand physics, but you know Omega has a really good track record—so good that it is more likely that your understanding of physics is false than that you can falsify Omega’s prediction. This is a strict reliance on empirical observations as opposed to abstract reason: count up how often Omega has been right and compute a prior.

Isn’t it that you aren’t entirely sure that you understand psychology, or that you do understand psychology well enough to think that you’re predictable? My understanding is that many people have run Newcomb’s Problem-style experiments at philosophy departments (or other places) and have a sufficiently high accuracy that it makes sense to one-box at such events, even against fallible human predictors.

• I can believe that it would make sense to commit ahead of time to one-box at such an event. Doing so would affect your behavior in a way that the predictor might pick up on.

Hmm. Thinking about this convinces me that there’s a big problem here in how we talk about the problem, because if we allow people who already knew about Newcomb’s Problem to play, there are really 4 possible actions, not 2:

• intended to one-box, one-boxed

• intended to one-box, two-boxed

• intended to two-box, one-boxed

• intended to two-box, two-boxed

I don’t know if the usual statement of Newcomb’s problem specifies whether the subject learns the rules of the game before or after the predictor makes a prediction. It seems to me that’s a critical factor. If the subject is told the rules of the game before the predictor observes the subject and makes a prediction, then we’re just saying Omega is a very good lie detector, and the problem is not even about decision theory, but about psychology: Do you have a good enough poker face to lie to Omega? If not, pre-commit to one-box.

We shouldn’t ask, “Should you two-box?”, but, “Should you two-box now, given how you would have acted earlier?” The various probabilities in the present depend on what you thought in the past. Under the proposition that Omega is perfect at predicting, the person inclined to two-box should still two-box, ’coz that $1M probably ain’t there.

So Newcomb’s problem isn’t a paradox. If we’re talking just about the final decision, the one made by a subject after Omega’s prediction, then the subject should probably two-box (as argued in the post). If we’re talking about two decisions, one before and one after the box-opening, then all we’re asking is whether you can convince Omega that you’re going to one-box if you aren’t. Then it would not be terribly hard to say that a predictor might be so good (say, an Amazing Kreskin-level cold-reader of humans, or if you are an AI) that your only hope is to precommit to one-box.

• I don’t think this gets Parfit’s Hitchhiker right. You need a decision theory that, when safely returned to the city, pays the rescuer even though they have no external obligation to do so. Otherwise they won’t have rescued you.

• I don’t think that what you need has any bearing on what reality has actually given you. Nor can we talk about different decision theories here—as long as we are talking about maximizing expected utility, we have our decision theory; that is enough specification to answer the Newcomb one-shot question. We can only arrive at a different outcome by stating the problem differently, or by sneaking in different metaphysics, or by just doing bad logic (in this case, usually allowing contradictory beliefs about free will in different parts of the analysis).

Your comment implies you’re talking about policy, which must be modelled as an iterated game. I don’t deny that one-boxing is good in the iterated game.

My concern in this post is that there’s been a lack of distinction in the community between “one-boxing is the best policy” and “one-boxing is the best decision at one point in time in a decision-theoretic analysis, which assumes complete freedom of choice at that moment.” This lack of distinction has led many people into wishful or magical rather than rational thinking.

• I don’t think that what you need has any bearing on what reality has actually given you.

As far as I can tell, I would pay Parfit’s Hitchhiker because of intuitions that were rewarded by natural selection. It would be nice to have a formalization that agrees with those intuitions.

or by sneaking in different metaphysics

This seems wrong to me, if you’re explicitly declaring different metaphysics (if you mean the thing by metaphysics that I think you mean). If I view myself as a function that generates an output based on inputs, and my decision-making procedure as the search for the best such function (for maximizing utility), then this could be considered different metaphysics from trying to cause the most increase in utility for myself by making decisions, but it’s not obvious that the latter leads to better decisions.

• You’re using words like “reputation”, and understand how having a reputation for one-boxing is preferable, but we’re discussing the level where Omega has access to the source code of your brain and can just tell whether you’ll one-box or not, as a matter of calculation.

So the source code of your brain just needs to decide whether it’ll be source code that one-boxes or not. This isn’t really about “precommitment” for that one specific scenario. Omega doesn’t need to know whether you have precommitted or not; Omega isn’t putting money in the boxes based on whether you have precommitted. It’s putting money in based on the decision you’ll arrive at, even if you yourself don’t know the decision yet.

You can’t make the decision in advance, because you may not know the exact parameters of the decision you’ll be asked to make (one-boxing and two-boxing are just examples of one particular type of decision). You can decide, however, whether you’re the sort of person who accepts that their decisions can be deterministically predicted in advance with sufficient certainty, or whether you’ll be claiming that other people predicting your choice must be a violation of causality (it’s not).

• So the source code of your brain just needs to decide whether it’ll be source code that one-boxes or not.

First, in the classic Newcomb’s problem, meeting Omega is a surprise to you. You don’t get to precommit to deciding one way or the other, because you had no idea such a situation would arise: you just get to decide now.

You can decide, however, whether you’re the sort of person who accepts that their decisions can be deterministically predicted in advance with sufficient certainty, or whether you’ll be claiming that other people predicting your choice must be a violation of causality (it’s not).

Why would you make such a decision if you don’t expect to meet Omega and don’t care much about philosophical head-scratchers?

And, by the way, predicting your choice is not a violation of causality, but believing that your choice (of the boxes, not of the source code) affects what’s in the boxes is.

Second, you are assuming that the brain is free to reconfigure and rewrite its software, which is clearly not true for humans and all existing agents.

• “Omega” is philosophical shorthand for “please accept this part of the thought experiment as a premise”. Newcomb’s problem isn’t supposed to be realistic; it’s supposed to isolate a corner case in reasoning and let us consider it apart from everything else. While it’s true that in reality you can’t assign probability 1 to Omega being a perfect predictor, the thought experiment nevertheless asks you to do so anyway—because otherwise the underlying issue would be too obscured by irrelevant details to solve it philosophically.

• If you rule out probabilities of 1, what probability do you assign to Omega cheating, and somehow gimmicking the boxes to change the contents the instant you indicate your choice, before the contents are revealed?

Presumably the mechanisms of “correct prediction” are irrelevant, and once your expectation that this instance will be predicted correctly gets above million-to-one, you one-box.


• Let’s say I build my Omega by using a perfect predictor plus a source of noise that’s uncorrelated with the prediction. It seems weird that you’d deterministically two-box against such an Omega, even though you’d deterministically one-box against a perfect predictor. Are you sure you did the math right?

• It seems weird that you’d deterministically two-box against such an Omega

Even in the case when the random noise dominates and the signal is imperceptibly small?

• I think the more relevant case is when the random noise is imperceptibly small. Of course you two-box if it’s basically random.

• So, at one point in my misspent youth I played with the idea of building an experimental Omega, and looked into the subject in some detail.

Martin Gardner’s 1973 writeup on this, reprinted in The Night Is Large, explained that the core idea still works if Omega can just predict with 90% accuracy.

Your choice of ONE box pays nothing if you’re predicted (incorrectly) to two-box, and pays $1M if predicted correctly at 90%, for a total EV of $900,000 (= 0.1 × 0 + 0.9 × 1,000,000).

Your choice of TWO boxes pays $1k if you’re predicted (correctly) to two-box, and pays $1,001,000 if you’re predicted to one-box, for a total EV of $101,000 (= 0.9 × 1,000 + 0.1 × 1,001,000 = 900 + 100,100).

So the expected profit from one-boxing in a normal game, with an Omega accuracy of 90%, would be $799,000.
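The arithmetic above checks out; a quick sketch with exact rational arithmetic (the standard $1M/$1k payoffs are assumed):

```python
from fractions import Fraction

def ev_one_box(acc, big=1_000_000, small=1_000):
    # $1M only when Omega correctly predicts your one-boxing.
    return acc * big

def ev_two_box(acc, big=1_000_000, small=1_000):
    # Equivalent to acc * small + (1 - acc) * (big + small):
    # the $1k is certain, the $1M shows up only on a mispredict.
    return small + (1 - acc) * big

acc = Fraction(9, 10)
print(ev_one_box(acc))                    # 900000
print(ev_two_box(acc))                    # 101000
print(ev_one_box(acc) - ev_two_box(acc))  # 799000
```

The two decompositions of the two-box EV (mine and Gardner’s) are algebraically identical, which is why both give $101,000.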

Also, by adjusting the game’s payouts we could hypothetically make any amount of genuine human predictability (even just a reliable 51% accuracy) be enough to motivate one-boxing.

The super-simplistic conceptual question here is the distinction between two kinds of sincerity. One kind of sincerity is assessed at the time of the promise. The other kind of sincerity is assessed retrospectively, by seeing whether the promise was upheld.

Then the standard version of the game tries to put a wedge between these concepts by supposing that maybe an initially sincere promise might be violated by the intervention of something like “free will”, and it tries to make this seem slightly more magical (more of a far-mode question?) by imagining that the promise was never even uttered, but rather was stolen from the person by the magical mind-reading “Omega” entity before the promise was ever even imagined by the person as being possible to make.

One thing that seems clear to me is that if one-boxing is profitable but not certain, then you might wish you could have done something in the past that would make it clear that you’ll one-box, so that you land in the part of Omega’s calculations where the prediction is easy, rather than being one of the edge cases where Omega really has to work for its Brier score.

On the other hand, the setup is also (probably purposefully) quite fishy. The promise that “you made” is originally implicit, and depending on your understanding of the game maybe extremely abstract. Omega doesn’t just tell you what it predicted. If you one-box and get nothing and complain, then Omega will probably try to twist it around and blame you for its failed prediction. If it all works then you seem to be getting free money, and why is anyone handing out free money?

The whole thing just “feels like the setup for a scam”. Like you one-box, get a million, then in your glow of positive trust you give some money to their charitable cause. Then it turns out the charitable cause was fake. Then it turns out the million dollars was counterfeit but your donation was real. Sucker!

And yet… you know, parents actually are pretty good at knowing when their kids are telling the truth or lying. And parents really do give their kids a free lunch. And it isn’t really a scam, it is just normal life as a mortal human being.

But also in the end, for someone to look their parents in the eyes and promise to be home before 10PM and really mean it for reals at the time of the promise, and then be given the car keys, and then come home at 1AM… that also happens. And wouldn’t it be great to just blame that on “free will” and “the 10% of the time that Omega’s predictions fail”?

Looping this back around to the larger AGI question, it seems like what we’re basically hoping for is to learn how to become a flawless Omega (or at least build some software that can do this job), at least for the restricted case of an AGI that we can give the car keys without fear that after it has the car keys it will play the “free will” card and grind us all up into fuel paste after promising not to.

• What part of physics implies someone cannot scan your brain and simulate inputs so as to perfectly predict your actions?

• The part of physics that implies someone cannot scan your brain and simulate inputs so as to perfectly predict your actions is quantum mechanics. But I don’t think invoking it is the best response to your question. Though it does make me wonder how Eliezer reconciles his thoughts on one-boxing with his many-worlds interpretation of QM. Doesn’t many-worlds imply that every game with Omega creates worlds in which Omega is wrong?

If they can perfectly predict your actions, then you have no choice, so talking about which choice to make is meaningless. If you believe you should one-box if Omega can perfectly predict your actions, but two-box otherwise, then you are better off trying to two-box: you’ve already agreed that you should two-box if Omega can’t perfectly predict your actions, and if Omega can, you won’t be able to two-box unless Omega already predicted that you would, so it won’t hurt to try.

• If you find an Omega, then you are in an environment where Omega is possible. Perhaps we are all simulated and QM is optional. Maybe we have enough determinism in our brains that Omega can make predictions, much as quantum mechanics ought in some sense to prevent predicting where a cannonball will fly, but in practice does not. Perhaps it’s a hypothetical where we’re AIs to begin with, so deterministic behavior is just to be expected.

• If they can perfectly predict your actions, then you have no choice, so talking about which choice to make is meaningless.

This was argued against in the Sequences and, in general, doesn’t seem to be a strong argument. It seems perfectly compatible to believe your actions follow deterministically and still talk about decision theory—all the functional decision theory stuff assumes a deterministic decision process, I think.

Re QM: sometimes I’ve seen it stipulated that the world in which the scenario happens is deterministic. It’s entirely possible that the amount of noise generated by QM isn’t enough to affect your choice (besides a very unlikely “your brain has a couple of bits changed randomly in exactly the right way to change your choice”, but that should be so many orders of magnitude unlikely as not to matter in any expected utility calculation).

• This was argued against in the Sequences and, in general, doesn’t seem to be a strong argument. It seems perfectly compatible to believe your actions follow deterministically and still talk about decision theory—all the functional decision theory stuff assumes a deterministic decision process, I think.

It is compatible to believe your actions follow deterministically and still talk about decision theory. It is not compatible to believe your actions follow deterministically and still talk about decision theory from a first-person point of view, as if you could by force of will violate your programming.

To ask what choice a deterministic entity should make presupposes both that it does, and does not, have choice. Presupposing a contradiction means STOP: your reasoning has crashed, and you can prove any conclusion if you continue.

• It is not compatible to believe your actions follow deterministically, and still talk about decision theory from a first-person point of view,

So it’s the pronouns that matter? If I keep using “Aris Katsaris” rather than “I”, does that make a difference to whether the person I’m talking about makes decisions that can be deterministically predicted?

Whether someone can predict your decisions has ZERO relevance to whether you are the one making the decisions. This sort of confusion, where people think that “free will” means “being unpredictable”, is nonsensical—it’s the very opposite. For the decisions to be yours, they must be theoretically predictable, arising from the contents of your brain. Adding in randomness and unpredictability, e.g. by using dice or coin flips, reduces the ownership of the decisions and hence the free will.

This is old and tired territory.

• Old and tired, maybe, but clearly there is not much consensus yet (even if, ahem, some people consider it to be as clear as day).

Note that who makes the decision is a matter of control and has nothing to do with freedom. A calculator controls its display, and so the “decision” to output 4 in response to 2+2 is its own, in a way. But applying decision theory to a calculator is nonsensical, and there is no free choice involved.

• Have you read http://lesswrong.com/lw/rb/possibility_and_couldness/ and the related posts, and do you have some disagreement with them?

• I just now read that one post. It isn’t clear how you think it’s relevant. I’m guessing you think it implies that positing free will is invalid.

You don’t have to believe in free will to incorporate it into a model of how humans act. We’re all nominalists here; we don’t believe that the concepts in our theories actually exist somewhere in Form-space.

When someone asks the question, “Should you one-box?”, they’re using a model which uses the concept of free will. You can’t object to that by saying “You don’t really have free will.” You can object that it is the wrong model to use for this problem, but then you have to spell out why, what model you want to use instead, and what question you actually want to ask, since it can’t be that one.

People in the LW community don’t usually do that. I see sloppy statements claiming that humans “should” one-box, based on a presumption that they have no free will. That’s making a claim within a paradigm while rejecting the paradigm. It makes no sense.

Consider what Eliezer says about coin flips:

We’ve previously discussed how probability is in the mind. If you are uncertain about whether a classical coin has landed heads or tails, that is a fact about your state of mind, not a property of the coin. The coin itself is either heads or tails. But people forget this, and think that coin.probability == 0.5, which is the Mind Projection Fallacy: treating properties of the mind as if they were properties of the external world.

The mind projection fallacy is treating the word “probability” not in a nominalist way, but in a philosophically realist way, as if probabilities were things existing in the world. Probabilities are subjective. You don’t project them onto the external world. That doesn’t make “coin.probability == 0.5” a “false” statement. It correctly specifies the distribution of possibilities given the information available within the mind making the probability assessment. I think that is what Eliezer is trying to say there.

“Free will” is a useful theoretical construct in a similar way. It may not be a thing in the world, but it is a model for talking about how we make decisions. We can only model our own brains; you can’t fully simulate your own brain within your own brain; you can’t demand that we use the territory as our map.

• It’s not just the one post, it’s the whole sequence of related posts.

It’s hard for me to summarize it all and do it justice, but it disagrees with the way you’re framing this. I would suggest you read some of that sequence and/or some of the decision theory papers for a defense of “should” notions being used even while believing in a deterministic world, which you reject. I don’t really want to argue the whole thing from scratch, but that is where our disagreement would lie.