# Pascal’s Mugging and One-shot Problems

I’ve had some thoughts on Pascal’s Mugging which might be worth sharing. I’m assuming some familiarity with Pascal’s Mugging in this post.

Before we really get started, let’s transform Pascal’s Mugging into a problem that is easier to reason about but still gets at the core idea.

First, forget “utility”, forget money; we’re maximising paperclips. I know it’s kind of silly, but using money or utility really tends to muddy up thinking. I was surprised by how much easier it was to reason about the problem when I switched to paperclips[0].

Now, here’s my version of the problem. Suppose there’s a casino, in which there is a googolplex-sided die (10^(10^100) sides). You can pay 10 paperclips (which are destroyed) to roll the die. If you roll a 1, the casino manufactures 3^^^^3 paperclips. Otherwise, nothing happens and you have lost your 10 paperclips.

3^^^^3 is much, much, much larger than a googolplex, so expected utility maximisation overwhelmingly says you should play at this casino if you’re trying to maximise the number of paperclips.
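To make the arithmetic concrete, here is a minimal sketch of the casino’s expected value. The numbers are stand-ins of my own choosing, since 3^^^^3 and a googolplex are far too large to represent directly; the only property that matters is that the payout dwarfs the number of die sides.

```python
from fractions import Fraction

# Stand-in numbers for the casino: an N-sided die and a payout R with
# R >> N. In the post, N is a googolplex and R is 3^^^^3, neither of
# which a computer can represent directly.
N = 10**100   # number of die sides (stand-in for a googolplex)
R = 10**200   # payout in paperclips (stand-in for 3^^^^3)
cost = 10     # price of one roll

# Exact expected change in paperclips from one paid roll.
ev_play = Fraction(1, N) * R - cost
print(ev_play > 0)                           # True: EU says "play"...
print(Fraction(1, N) < Fraction(1, 10**90))  # True: ...despite a ~0 win chance
```

Because the payout grows much faster than the odds shrink, the expected value stays positive no matter how unlikely the win is.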

(From this point on, I will refer to “expected utility” as “EU”.)

In some cases, EU maximisation gives the right answer. Here are three such cases.

Case 1: If you expect to live a googolplex years, then paying to roll the die a few times per year is a good idea, because over that time frame the probability of winning at least once is very high.

Case 2: You have a normal lifespan, but there are a googolplex other paperclip maximisers on Earth. In this case, everyone plays, and the probability that at least one person wins is very high.

Case 3: You are the only paperclip maximiser in this universe, but there are a googolplex alternate universes that contain alternate “you”s who are in the same situation as you, and for whom the dice rolls are all independent. In this case the probability that at least one “you” wins is high, so you should play.

Here is a different case.

Case 0: You are alone in existence. There is no one else on Earth, there are no alternate realities; you are literally alone in the entirety of all existence. This is the only decision you will ever get the chance to make. Should you pay 10 paperclips to roll the die?

I think it’s clear that in Case 0 you should not pay. If you pay, you lose 10 paperclips[1] and, for all practical purposes, are certain to lose. If you don’t pay, at least you get to keep your 10 paperclips. Since we’re trying to maximise the number of paperclips, the latter wins.

The key difference between Case 0 and the other three cases is that in Case 0, you only get one chance to maximise paperclips. I’m calling this sort of scenario a one-shot problem.

A one-shot problem can be described as follows: you try to maximise paperclips by picking exactly one choice from a finite set of mutually exclusive choices, where each choice has a finite set of possible outcomes whose probabilities sum to 1, and each outcome contains a finite number of paperclips.

I think a key insight, which I don’t know enough to prove but which seems correct, is that any finite sequence of decisions can be transformed into a one-shot problem, simply by viewing each possible strategy (a complete plan specifying what you would do at every decision point) as a single choice.

As a simple example, if choosing between flipping a coin or rolling a die is one decision, then flipping a coin and then rolling a die is two decisions. But they can be combined into a single choice in which you both flip a coin and subsequently roll a die. This combination works even when the sequence of decisions is more complicated; for example, choosing to flip a coin and then, if it lands heads, rolling a die, and otherwise flipping the coin again.
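As a sketch of this transformation (the action names and two-stage structure are illustrative choices of mine, not from the post), a sequential decision can be flattened into a one-shot choice among complete plans:

```python
from itertools import product

# Toy two-stage decision: first pick "coin" or "die"; after seeing the
# result, pick "stop" or "again". A complete plan (strategy) specifies
# the second action for every possible first outcome, so the sequence
# of decisions collapses into a single one-shot choice among plans.
first_actions = ["coin", "die"]
outcomes = {"coin": ["heads", "tails"], "die": [1, 2, 3, 4, 5, 6]}
second_actions = ["stop", "again"]

strategies = []
for a in first_actions:
    # one second-stage action per possible outcome of the first action
    for reaction in product(second_actions, repeat=len(outcomes[a])):
        plan = {"first": a, "then": dict(zip(outcomes[a], reaction))}
        strategies.append(plan)

# 2^2 plans starting with the coin + 2^6 starting with the die = 68
print(len(strategies))  # 68
```

Each element of `strategies` is one mutually exclusive “choice” in the one-shot sense: picking it fixes everything the agent will ever do.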

I expect you can do something similar to group the decisions of alternate “you”s into one, or even the decisions of other people on Earth, so long as their decision-making procedure is similar enough to yours.

If I’m right that this transformation is possible, that means one-shot problems are isomorphic to finite multi-shot problems, and hence insights into one are applicable to the other. This means that a solution to one-shot problems should give a solution to Pascal’s Mugging in general.

Solving one-shot problems means finding a decision-making procedure that maximises paperclips when you only have one decision. One might expect that EU, which is all about maximising paperclips, would at least provide some insight into this.

Surprisingly, EU doesn’t seem to help. The key property of an EU maximiser is that as it makes more and more decisions, the probability that it ends up with more paperclips approaches 1.

For example, in my casino version of Pascal’s Mugging, with 1 repetition there is a very low probability of an EU maximiser winning. But with a googolplex repetitions, there is a ~63% chance of it winning at least once, and with 2 googolplex repetitions that probability becomes ~86%. In the long run, the probability that EU comes out on top approaches 1.
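The quoted percentages follow from P(at least one win in k·N rolls) = 1 − (1 − 1/N)^(kN) ≈ 1 − e^(−k). A googolplex is too large to plug in directly, but the limit is already visible at a much smaller stand-in N:

```python
import math

N = 10**6  # stand-in for a googolplex; the limit barely depends on N
for k in (1, 2):
    p_win = 1 - (1 - 1 / N) ** (k * N)   # chance of winning at least once
    print(k, round(p_win, 4), round(1 - math.exp(-k), 4))
# k=1: ~0.6321 (~63%); k=2: ~0.8647 (~86%)
```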

This means that EU completely sidesteps the problem of how to make decisions under uncertainty, by choosing the sequence of decisions that has a virtually 100% probability of winning out in the long run!

In summary, Pascal’s Mugging occurs because expected utility maximisation depends on there being a long time frame, and solving what to do when there isn’t a long enough time frame is equivalent to solving the simpler case where you only get to make a single decision.

Thank you for reading this, and I hope to learn a lot from your replies!

[0] Switching to paperclip maximising also helps show why I think bounded utility functions are an incomplete solution to Pascal’s Mugging. Which choice is optimal for maximising the number of paperclips in the world? This is a seemingly factual question, and our best answer to it is expected utility maximisation, which is vulnerable to Pascal’s Mugging. The question is independent of our utility function, and can’t be resolved by saying that we should use a bounded utility function.

Using paperclip maximisation also helps remove anthropic problems from Pascal’s Mugging. You can argue that producing 3^^^^3 utility requires creating 3^^^^3 people, which means the probability that you are one of those 3^^^^3 people counterbalances the reward from being Pascal-Mugged. But this reasoning does not work if your “utility” is paperclips.

[1] Note that the price being 10 paperclips is only a courtesy from the casino. They could charge a billion paperclips per roll and EU would still say that you should pay up.

Update: On further reflection, my criticism of bounded utility functions in the zeroth footnote is wrong. I’ve updated in this direction due to Dagon’s second point in the comments (thank you!). Maximising the number of paperclips can be done with a bounded utility function as well: for example, the function 1 - 1/2^p, where p is a nonnegative integer giving the number of paperclips, is bounded between 0 and 1.

That this can be done is surprising to me right now, and suggests that I need to think some more about all this.
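A quick sanity check of the update’s example function, computed with exact rationals to avoid floating-point rounding: U(p) = 1 - 1/2^p is strictly increasing in the number of paperclips yet bounded.

```python
from fractions import Fraction

def U(p):
    """Bounded utility from the update: U(p) = 1 - 1/2^p."""
    return 1 - Fraction(1, 2**p)

values = [U(p) for p in range(1000)]
print(all(a < b for a, b in zip(values, values[1:])))  # True: more clips is always better
print(all(0 <= v < 1 for v in values))                 # True: bounded in [0, 1)
```

So the function never stops rewarding additional paperclips, even though its values all fit inside [0, 1).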

• If you literally maximize the expected number of paperclips, using standard decision theory, you will always pay the casino. To refuse the one-shot game, you need to have a nonlinear utility function, or be doing something weird like median outcome maximization:

Choose action A to maximize m such that P(paperclip count > m | A) = 1/2.

A well-defined rule, and one that will behave like expected utility maximization in a sufficiently vast multiverse.

• What do you mean by a sufficiently large multiverse? If your first choice loses many paperclips in 40% of cases and wins a few in the rest, you would take it and a maximizer wouldn’t.

• If you were truly alone in the multiverse, this algorithm would take a bet that had a 51% chance of winning 1 paperclip, and a 49% chance of losing 1,000,000 of them.

If independent versions of this bet are taking place in 3^^^3 parallel universes, it will refuse.

For any finite bet, and for any sufficiently large multiverse: if the agent is using TDT and is faced with the choice of whether to make this bet in each of the parallel universes, it will behave like an expected utility maximizer.
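The bet in this thread can be sketched numerically, comparing what a median-outcome maximiser and an expectation maximiser see (the binomial-median approximation below is rough but adequate here):

```python
# The bet: +1 paperclip with probability 0.51, -1,000,000 with probability 0.49.
p_win, gain, loss = 0.51, 1, -1_000_000

# Alone: the median outcome of a single bet is just the more likely branch.
median_alone = gain if p_win > 0.5 else loss
ev_alone = p_win * gain + (1 - p_win) * loss
print(median_alone)   # 1: the lone median maximiser takes the bet
print(ev_alone < 0)   # True: the expectation is hugely negative

# Across n independent copies, wins ~ Binomial(n, 0.51), whose median is
# roughly 0.51*n, so the median *total* is deeply negative and the
# correlated agents refuse.
n = 10**6
typical_wins = round(p_win * n)
median_total = typical_wins * gain + (n - typical_wins) * loss
print(median_total < 0)  # True
```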

• 1) You’re correct that “known finite iterations” can be treated as “single-shot” by defining a complete strategy and not caring about intermediate states. “Unknown ending conditions” may or may not be reducible in this way.

2) You can’t get away from utility. You have to define how much better a universe with X − 10n + 3^^^^3 paperclips is than a universe with X, or than a universe with X − 10n (where X is the starting number of paperclips and n is the number of wagers you’ll make before giving up or hitting the jackpot).

3) Using ludicrous numbers breaks most people’s intuitions (cf. scope insensitivity), and you should explain why you don’t use a 100-sided die and a payout of a trillion paperclips.

2. Hm, I was working under the assumption that the “utility” with paperclips was just the number of paperclips. A universe with X − 10n + 3^^^^3 paperclips is better than a universe with just X paperclips by 3^^^^3 − 10n. Is this not a proper utility function?

3. The casino version evolved from repeated alterations to Pascal’s Mugging, so it retained the 3^^^^3 from there. I had written a paragraph mentioning that for one-shot problems, even a more realistic probability could qualify as a Pascal’s Mugging, though I had used a 1/million chance of a trillion paperclips instead of 1/100. I ended up editing that paragraph out, though.

Working with a 1/100 probability, it’s less obviously a bad idea to pay up, of course. I don’t know where to draw the line between “this is a Pascal’s Mugging” and “these are good odds”, so I’m less confident that you shouldn’t pay up given a 1/100 probability. I think paying becomes a more obviously bad idea if we raise the casino’s price, for example to 1 million paperclips. Paying still has positive EU, but carries a fairly steep cost compared to doing nothing, unless you get pretty lucky.

Looking back, I think that one of the factors in my decision to retain such ludicrous numbers was that it seemed more persuasive. I apologise for this.

All that being said, thank you very much for your reply!

• I think your analysis of “maximise” just compares x > y without regard to how much bigger x is, which is kind of a natural consequence of subtracting expected utility out. However, it does highlight that if our goal is to “maximise paperclips”, it doesn’t really say whether “winning harder” is relevant or not. That is, 2 > 1, but so is 1000 > 1. So for cases when an outcome is not a constant amount of paperclips, we need more rules than just what the object of attention is. So a paperclip maximiser is actually underspecified.

• Very interesting, thank you!

I think “maximising” still makes sense in one-shot problems. 2 > 1 and 1000 > 1, but it’s also the case that 1000 > 2, even without expected utility. The way I see it, EU is a method of comparing choices based on their average utility, but the “average” turns out to be a less useful metric when you only have one chance.

> So for cases when an outcome is not a constant amount of paperclips, we need more rules than just what the object of attention is. So a paperclip maximiser is actually underspecified.

If this is true, it would imply that in a one-shot problem, a utility function is not enough on its own to determine the “optimal” choice when you want to “maximise” that utility function (get the highest value you can). This would be a pretty big result, I think.

I think that if there is a part that is underspecified, though, it’s not the paperclip maximiser but the word “optimal”. What does it mean for a choice to be “optimal” relative to other choices, when it might turn out better or worse depending on luck? I haven’t been able to answer that question.

• Many times, opinions on how to handle uncertainty get baked into the utility function. The standard naive construction is to say “be risk neutral” and value paperclips linearly in their amount. But I could imagine a policy for which more paperclips is always better, yet which, starting from a default position of 2 paperclips with certainty, wouldn’t choose an option giving a 0.1% chance of 1 paperclip, a 49.9% chance of 2 paperclips, and a 50% chance of 3 paperclips. One can construct a “risk averse” function such that this policy is just optimising the new function. But does that really mean the new function is not a paperclip-maximisation function?
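To check that such a function exists, here is one hypothetical construction of my own (the specific shape is not from the comment): a utility that strictly prefers more paperclips, yet gives the 0.1%/49.9%/50% lottery a lower expected utility than a sure 2 paperclips.

```python
from fractions import Fraction

def U(p):
    """Strictly increasing in p, but the step from 1 to 2 clips is huge
    relative to all later gains, making the agent risk averse near 2."""
    half = Fraction(1, 2)
    if p < 2:
        return half ** (2 - p)                 # 1/4, 1/2 for p = 0, 1
    return 1 + (1 - half ** (p - 2)) / 1000    # 1, 1.0005, ... (tiny gains)

lottery = [(Fraction(1, 1000), 1), (Fraction(499, 1000), 2), (Fraction(1, 2), 3)]
eu_lottery = sum(prob * U(p) for prob, p in lottery)

print(all(U(p) < U(p + 1) for p in range(100)))  # True: more is always better
print(eu_lottery < U(2))                          # True: the sure 2 clips win
```

So a bounded, risk-averse function can refuse the lottery while still always ranking more paperclips above fewer.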

• You’re absolutely right. I was starting to get at this idea from another of the comments, but you’ve laid out where I went wrong very clearly. Thank you.

• Are you rejecting Pascal’s mugging because of the prospect of relying on uncertain models that you do not expect to confirm?

Is all your intuition captured by maximizing utility over all but the extreme billionth of the distribution?

Here’s a one-shot problem for your intuition to answer: you get to design the probability distribution from which the number of paperclips is drawn, except that its expectation must be at most its negative Kolmogorov complexity. What distribution makes for a good choice?

• Thank you for your response!

> Are you rejecting Pascal’s mugging because of the prospect of relying on uncertain models that you do not expect to confirm?

My intuition is that in a one-shot problem, gambling everything on an extremely low-probability event is a bad idea, even when the reward from that event is very high, because you are effectively certain to lose. This is the basis for my not paying up in Pascal’s Mugging and in the casino problem in the post.

I’m trying to keep my reasoning simple, so in my examples I always assume that there are no infinities, no unknown unknowns, every outcome of every choice is statistically independent, and all the assigned probabilities are statistically correct (if there is a 1/6 chance of an outcome and you get to repeat the problem, you will get that outcome on average 1/6 of the time).

> Is all your intuition captured by maximizing utility over all but the extreme billionth of the distribution?

Honestly, I have no idea how to solve the problem. My intuition is hopelessly muddled on this, and every idea I’ve been able to come up with seems flawed, including the one you’ve just asked about.

> Here’s a one-shot problem for your intuition to answer: you get to design the probability distribution from which the number of paperclips is drawn, except that its expectation must be at most its negative Kolmogorov complexity. What distribution makes for a good choice?

My first thought is a 1/googolplex chance of losing 3^^^^3 paperclips, with the rest of the probability mass giving as many paperclips as the Kolmogorov complexity constraint allows. I could do better by increasing the probability of the loss; for example, 1/googol would be a better probability. However, I have no idea where to draw the line, at what point it stops being a good idea to keep increasing that probability.

• In a market of bettors who draw the line of how much risk to take at different points, the early game will be dominated by the most risk-taking folks, and as the game grows older, the line chosen by the current winners moves. Perhaps your intuition is merely the product of evolution playing this game for as long as it took for the line to reach its current point?