• If zero and one aren’t prob­a­bil­ities, how does Bayesian con­di­tion­ing work? My un­der­stand­ing is that a Bayesian has to be cer­tain of the truth of what­ever propo­si­tion that she con­di­tions on when up­dat­ing.

• Zero and one are probabilities. The apparent opposite claim is hyperbole intended to communicate something else, but people on LessWrong persistently make the mistake of taking it literally. For examples of 0 and 1 appearing unavoidably in the theory of probability, P(A|A) = 1 and P(A|~A) = 0. If someone disputes either of these formulae, the onus is on them to rebuild probability theory in a way that avoids them. As far as I know, no one has even attempted this.

But P(A|B) = P(A&B)/​P(B) for any pos­i­tive value of P(B). You can con­di­tion on ev­i­dence all day with­out ever need­ing to as­sert a cer­tainty about any­thing. Your con­clu­sions will all be hy­po­thet­i­cal, of the form “if this is the prior over A and this B is the ev­i­dence, this is the pos­te­rior over A”. If the ev­i­dence is un­cer­tain, this can be in­cor­po­rated into the calcu­la­tion, giv­ing con­clu­sions of the form “given this prior over A and this prob­a­bil­ity dis­tri­bu­tion over pos­si­ble ev­i­dence B, this is the pos­te­rior over A.”
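As a rough illustration of the formula above, here is a minimal sketch of conditioning on a small joint distribution. The joint table and the helper names `prob` and `conditional` are made up for the example; nothing here asserts that B is certain, the result is only the hypothetical "if B, then this posterior over A".

```python
from fractions import Fraction

# Hypothetical joint distribution over (A, B); the numbers are invented.
joint = {
    (True, True): Fraction(3, 10),
    (True, False): Fraction(1, 10),
    (False, True): Fraction(2, 10),
    (False, False): Fraction(4, 10),
}

def prob(event):
    """Total probability of the outcomes (a, b) for which event(a, b) holds."""
    return sum(p for (a, b), p in joint.items() if event(a, b))

def conditional(event, given):
    """P(event | given) = P(event & given) / P(given), for P(given) > 0 only."""
    p_given = prob(given)
    if p_given == 0:
        raise ValueError("cannot condition on an event of probability zero")
    return prob(lambda a, b: event(a, b) and given(a, b)) / p_given

prior_a = prob(lambda a, b: a)                             # P(A)
posterior_a = conditional(lambda a, b: a, lambda a, b: b)  # P(A | B)
```

With these invented numbers the prior P(A) is 2/5 and the posterior P(A|B) is 3/5. The same helper also reproduces the two identities mentioned earlier in the thread: `conditional` of A given A is 1, and of A given not-A is 0.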

If you are un­cer­tain even of the prob­a­bil­ity dis­tri­bu­tion over B, then a hard-core Bayesian will say that that un­cer­tainty is mod­el­led by a dis­tri­bu­tion over dis­tri­bu­tions of B, which can be folded down into a dis­tri­bu­tion over B. Soft-core Bayesi­ans will scoff at this, and turn to magic, a.k.a. model check­ing, hu­man un­der­stand­ing, etc. Hard-core Bayesi­ans will say that these only work to the ex­tent that they ap­prox­i­mate to Bayesian in­fer­ence. Soft-core Bayesi­ans aren’t listen­ing at this point, but if they were they might challenge the hard-core Bayesi­ans to pro­duce an ac­tual method that works bet­ter.
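The "folding down" step that the hard-core Bayesian appeals to is just a weighted average. A minimal sketch, assuming B is binary and that there are only two candidate distributions over it (the weights and probabilities are invented, and `fold` is a hypothetical helper name):

```python
# Hypothetical second-order uncertainty: two candidate distributions over
# the evidence B, each paired with a made-up weight (the weights sum to 1).
candidates = [
    (0.6, {"b": 0.9, "not_b": 0.1}),  # (weight, distribution over B)
    (0.4, {"b": 0.5, "not_b": 0.5}),
]

def fold(candidates):
    """Collapse a distribution over distributions into a single distribution
    by taking the weighted average of the candidates (total probability)."""
    folded = {}
    for weight, dist in candidates:
        for outcome, p in dist.items():
            folded[outcome] = folded.get(outcome, 0.0) + weight * p
    return folded

folded = fold(candidates)  # roughly {"b": 0.74, "not_b": 0.26}
```

Whether this collapse loses anything that matters in practice is exactly what the soft-core side disputes; the sketch only shows that the arithmetic itself is simple.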

• My un­der­stand­ing is that a Bayesian has to be cer­tain of the truth of what­ever propo­si­tion that she con­di­tions on when up­dat­ing.

This isn’t necessary. In many circumstances, you can approximate the probability of an observation you’re updating on as 1, such as an observation that a coin came up heads. An observation never literally has a probability of 1 (you could be hallucinating, or be a brain in a jar, etc.). Sometimes observations are uncertain enough that you can’t approximate them as 1, but you can still do the math to update on them (“Did I really see a mouse? I might have imagined it. Update on .7 probability observation of mouse.”)

• Yeah, but if your observation does not have a probability of 1 then Bayesian conditionalization is the wrong update rule. I take it this was Alex’s point. If you updated on a 0.7 probability observation using Bayesian conditionalization, you would be vulnerable to a Dutch book. The correct update rule in this circumstance is Jeffrey conditionalization. If P1 is your distribution prior to the observation and P2 is the distribution after the observation, the update rule for a hypothesis H given evidence E is:

P2(H) = P1(H | E) P2(E) + P1(H | ~E) P2(~E)

If P2(E) is suffi­ciently close to 1, the con­tri­bu­tion of the sec­ond term in the sum is neg­ligible and Bayesian con­di­tion­al­iza­tion is a fine ap­prox­i­ma­tion.
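A small numeric sketch of the rule, using the 0.7 "did I really see a mouse?" observation from the parent comment. The conditional probabilities 0.9 and 0.2 are invented for illustration, and `jeffrey_update` is a hypothetical helper name:

```python
def jeffrey_update(p1_h_given_e, p1_h_given_not_e, p2_e):
    """Jeffrey conditionalization:
    P2(H) = P1(H | E) * P2(E) + P1(H | ~E) * P2(~E).
    Assumes the conditional probabilities are unchanged by the observation."""
    if not 0.0 <= p2_e <= 1.0:
        raise ValueError("P2(E) must lie in [0, 1]")
    return p1_h_given_e * p2_e + p1_h_given_not_e * (1.0 - p2_e)

# H = "there is a mouse in the house", E = "I saw a mouse" (made-up numbers).
uncertain = jeffrey_update(0.9, 0.2, 0.7)      # about 0.69
near_certain = jeffrey_update(0.9, 0.2, 0.99)  # about 0.893
strict = jeffrey_update(0.9, 0.2, 1.0)         # 0.9, ordinary conditionalization
```

As P2(E) approaches 1 the result approaches P1(H | E), which is the sense in which strict conditionalization is the limiting case.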

• This is a strange dis­tinc­tion, Jeffrey con­di­tion­al­iza­tion. A lit­tle google search­ing shows that some­one got their name added to con­di­tion­ing on E and ~E. To me that’s just a straight ap­pli­ca­tion of prob­a­bil­ity the­ory. It’s not like I just fell off the turnip truck, but I’ve never heard any­one give this a name be­fore.

To get a marginal, you con­di­tion on what you know, and sum across the other things you don’t. I dis­like the end­less mul­ti­pli­ca­tion of terms for spe­cial cases where the gen­eral form is clear enough.

• I dis­like the end­less mul­ti­pli­ca­tion of terms for spe­cial cases where the gen­eral form is clear enough.

I don’t know. I like having names for things. It makes it easier to refer to them. And to be fair to Jeffrey, while the update rule itself is a trivial consequence of probability theory (assuming the conditional probabilities are invariant), his reason for explicitly advocating it was the important epistemological point that absolute certainty (probability 1) is a sort of degenerate epistemic state. Think of his name being attached to the rule as recognition not of some new piece of math but of an insight into the nature of knowledge and learning.

• If you ob­serve X then the thing you up­date on is “I ob­served X” and not just “X”. Just be­cause you ob­served some­thing doesn’t mean it was nec­es­sar­ily the case (you could be hal­lu­ci­nat­ing etc.). So while you don’t as­sign prob­a­bil­ity 1 to “X” you do as­sign prob­a­bil­ity 1 to “I ob­served X”, which is fine.

• Peo­ple here seem con­fi­dent that there ex­ists a de­ci­sion the­ory im­mune to black­mail. I see a large amount of dis­cus­sion of how to make an AI im­mune to black­mail, but I’ve never seen it es­tab­lished (or even ar­gued for) that do­ing so is pos­si­ble. I think I missed some­thing vi­tal to these dis­cus­sions some­where. Could some­one point me to it, or ex­plain here?

• I’m not aware of a satis­fac­tory treat­ment of black­mail (in the con­text of re­flec­tive de­ci­sion the­ory). The main prob­lem ap­pears to be that it’s not clear what “black­mail” is, ex­actly, how to for­mally dis­t­in­guish black­mail from trade.

• I think black­mail can be taken as a form of threat. To use Schel­ling’s defi­ni­tion (Strat­egy of Con­flict), a threat has the prop­erty that af­ter the per­son be­ing threat­ened fails to perform the speci­fied ac­tion, the threat­ener does not want to carry out the threat any more. In other words, the threat­ener has a cred­i­bil­ity prob­lem: he has to con­vince the tar­get of the threat that he will carry it out once he de­sires not to. This re­quires some form of pre-com­mit­ment, or an iter­ated game, or a suc­cess­ful bluff, or some­thing along those lines.

What do you see as the dis­t­in­guish­ing differ­ence be­tween black­mail and a threat? (I as­sume that black­mail is a sub­set of threats, but I sup­pose that might not be uni­ver­sally agreed.)

• Coun­terex­am­ple:

A man se­duces a fe­male movie star into a one night stand and se­cretly records a sex tape. He would pre­fer to black­mail the movie star for lots of money, but if that fails he would rather re­lease the tape to the press for a smaller amount of money + pres­tige than he would just do noth­ing. The movie star’s prefer­ence or­der­ing is for noth­ing to hap­pen, for her to pay out, then lastly for the press to find out. The op­ti­mal choice is for her to pay out, be­cause if she pre-com­mits to not give in to black­mail, she will re­ceive the worst pos­si­ble out­come.

This seems to fall squarely under blackmail, yet requires no pre-commitment, iteration, or bluffing.
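The counterexample can be checked by backward induction. This is only a sketch: the ordinal payoffs are invented to match the stated preference orderings, and it assumes the man stays quiet once paid. The second payoff table, `BLUFFER`, is a hypothetical contrast case matching Schelling's notion of a threat, where the threatener would rather do nothing after a refusal.

```python
# Ordinal payoffs (higher is better), invented to match the stated orderings.
MAN = {"paid": 3, "release": 2, "nothing": 1}
STAR = {"nothing": 3, "paid": 2, "release": 1}

# Contrast case: a threatener who would not want to follow through.
BLUFFER = {"paid": 3, "nothing": 2, "release": 1}

def man_response_to_refusal(man):
    """After a refusal the man picks whichever remaining option he prefers."""
    return max(["release", "nothing"], key=man.get)

def star_best_choice(man, star):
    """The star compares paying with the outcome that follows a refusal."""
    outcomes = {"pay": "paid", "refuse": man_response_to_refusal(man)}
    return max(outcomes, key=lambda choice: star[outcomes[choice]])

choice_vs_man = star_best_choice(MAN, STAR)          # "pay"
choice_vs_bluffer = star_best_choice(BLUFFER, STAR)  # "refuse"
```

Against the man in the story, refusing leads to release, so paying is her best reply with no commitment device needed on his side. Against the bluffer, refusing is safe, which is the case where he would need pre-commitment, iteration, or a bluff.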

• This seems to fall squarely under blackmail, yet requires no pre-commitment, iteration, or bluffing.

It does, mak­ing ‘black­mail’ the wrong term to use when con­sid­er­ing game the­ory sce­nar­ios. Some are ‘threats’, some are sim­ply trade.

• I would say that black­mail is the in­ter­sec­tion of {things that are in­dis­t­in­guish­able from threats} and {things that are in­dis­t­in­guish­able from trade}. And yes, from the per­spec­tive of the black­mailers that in­cludes both some things that are trade and some that are threats.

• Yep. Even with­out the pay­off—the black­mail im­mune agent is go­ing to in­ter­act poorly with stupid black­mailer agent which sim­ply doesn’t un­der­stand that the black­mail im­mune agent won’t pay. Or with evil black­mailer agent that just de­rives pos­i­tive util­ity from your mis­for­tune, which is the case with most black­mailers in prac­tice.

The winning strategy depends on the ecosystem. The winning strategy among defectors is to defect, but in a mixed population tit-for-tat works great. Decision systems just tend to converge on a sort of self-fulfilling-prophecy solution.

edit: i.e. there’s a rock-paper-scissors situation between different agents; they are not well ordered in ‘betterness’.
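The ecosystem-dependence claim can be illustrated with a toy iterated prisoner's dilemma. This is a sketch under assumptions: the payoff numbers are the conventional 5/3/1/0 values, matches last ten rounds, and the two populations are invented for the example.

```python
# Conventional prisoner's dilemma payoffs: (row player, column player).
PAYOFF = {
    ("C", "C"): (3, 3), ("C", "D"): (0, 5),
    ("D", "C"): (5, 0), ("D", "D"): (1, 1),
}

def always_defect(opponent_history):
    return "D"

def tit_for_tat(opponent_history):
    """Cooperate first, then copy the opponent's previous move."""
    return opponent_history[-1] if opponent_history else "C"

def play(strategy_a, strategy_b, rounds=10):
    """Play one match and return the total scores (a, b)."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_b)
        move_b = strategy_b(hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

def score_against(strategy, population, rounds=10):
    """Total score of `strategy` after one match against each member."""
    return sum(play(strategy, other, rounds)[0] for other in population)

defectors = [always_defect] * 3
mixed = [tit_for_tat, tit_for_tat, always_defect]
```

Among the defectors, `always_defect` scores 30 against 27 for `tit_for_tat`; in the mixed population `tit_for_tat` scores 69 against 38. Neither strategy is better in every ecosystem.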

• Ac­tu­ally, I’m not sure this does fall squarely un­der black­mail.

Con­sider the case where some­one has a tape I don’t want shown to the press, and sells that tape to the press for money + pres­tige, and never gives me any choice in the mat­ter. That’s clearly not black­mail. I’m not sure it be­comes black­mail when they give me a choice to pay them in­stead, though the case could be made.

Or con­sider the case where it turns out I don’t mind hav­ing the tape shown (I want the pub­lic­ity, say), and so the per­son sells the tape to the press, and ev­ery­one gets what they want. Also not black­mail. Not even clearly at­tempted black­mail, though the case could be made.

My point be­ing that it seems to me that for me to le­gi­t­i­mately call some­thing “black­mail” it needs to be some­thing the black­mailer threat­ens to do only be­cause it makes me suffer more than pay­ing them, not some­thing that the black­mailer wants to do any­way for his own rea­sons that just hap­pens to make me suffer.

• I dis­agree that the es­sen­tial el­e­ment to black­mail is it must be done only to make me suffer. To this end I offer a sce­nario. (I’ve made it a lit­tle more like a story just for gig­gles).

You’ve just won the lottery, and the TV people interviewed you and your wife. Hurray! Shortly after, Angela, your mistress, calls you up.
“Congratulations. I saw on TV that you won the lottery… I also saw you had a wife. Things are over between us!”
“Angela. I’m sorry for lying, baby, but you’ve got a husband.”
“Irrelevant. But, you know, you’re rich now. How about you give me $4,000 a week for the next ten years, and I don’t tell your wife.”
“Oh come on, you’ve got a husband!”
“Yeah, and you’re rich. I expect the money in a week.” Click.
Well, that’s what you get for cheating. Suddenly Julia, your other mistress, calls you up.
“Congratulations. I saw on TV that you won the lottery… I also saw you had a wife. Things are over between us!”
“Julia. I’m sorry for lying, baby.”
“Fuck you. I should tell her right now. But, you know, you’re rich now. How about you give me $4,000 a week for the next ten years, and I don’t tell your wife.”
“Oh come on, not again.”
“What’s that mean? Besides, you’re rich. I expect the money in a week.” Click.
Way to go, two-timer. However, unfortunately for you, the TV show calls you up and lets you know that there was a printing error and you did NOT in fact win the lottery. You quickly call your mistresses back.
“Angela, baby, turns out I didn’t win the lottery. I can’t possibly pay you.”
“Don’t baby me. Well, I guess I won’t tell your wife then, as long as you don’t tell my husband. I don’t want anyone to know this.”
“Thanks Angela, you’re the best.”
Hurray, bullet dodged!
“Julia, baby, turns out I didn’t win the lottery. I can’t possibly pay you.”
“Don’t baby me. Well, I guess I’ll tell your wife then, asshole. Since you won’t pay me I’m gonna go post on Facebook now!”
“But I can’t pay you because the money never belonged to me.”
“Irrelevant. Sucks to be you!”

I ar­gue that both of them were at­tempt­ing to black­mail and Ju­lia’s de­sire to fol­low through with it any­ways doesn’t change any­thing. The ac­tions would both feel like black­mail to me if I were on the re­ceiv­ing end, and the po­lice would treat both of them as black­mail as well. Black­mail is just an at­tempt to get money in ex­change for not re­leas­ing in­for­ma­tion; the mind­set of the black­mailer does not af­fect it. This is why I agree with Vladimir Nesov, that black­mail ex­ists in a blurry spot on the con­tinuum of trade.

If you don’t clas­sify Ju­lia’s ac­tions as black­mail, I would be cu­ri­ous what you do call it.

• I clas­sify Ju­lia’s ac­tions as in­con­sis­tent, mostly.

At time T1, Ju­lia prefers to date me rather than end our re­la­tion­ship and tell my wife.
At time T2, Ju­lia prefers to end our re­la­tion­ship and tell my wife.
The transition between T1 and T2 evidently has something to do with the transient belief that her silence was worth $4k/week, but what exactly it has to do with that belief is unclear, since by Julia’s own account the truth or falsehood of that belief is irrelevant.

If I take her account as definitive, I’m pretty clear that what Julia is doing is not blackmail… it reduces to “Hey, I’ve decided to tell your wife about us, and there’s nothing you can do to stop me.” It isn’t even a threat, it’s just early warning of the intent to harm me.

If I assume she’s lying about her motives, either consciously or with some degree of self-delusion, it might be blackmail. For example, if she believes I really can afford to pay her, and am just claiming poverty as a negotiating tactic, which she is countering by claiming not to care about the money, then it follows that she’s blackmailing me.

If I assume that she doesn’t really have relevant motives anymore, she just precommitted to reveal the information if I don’t pay her and now she’s following through on her previous precommitment, and the fact that the precommitment was made based on one set of beliefs about the world and she now knows those beliefs were false at the time doesn’t change the fact that the precommitment was made (“often wrong, never uncertain”), then she clearly blackmailed me once, and I guess it follows that she’s still blackmailing me… maybe?

It seems that if she set up a mechanical device that posts the secret to Facebook unless fed $4k in quarters once a week, and then changed her mind and decided she’d rather just keep dating me, but was unable to turn the device off, we could in the same sense say she was still blackmailing me, albeit against her own will. That is at best a problematic sense of blackmail, but not clearly an incorrect one.

So it seems pretty clear that the black­mailer’s in­ten­tions play some role in my in­tu­itions about what is or isn’t black­mail, albeit a murky one.

Of course, I could choose to ig­nore my lin­guis­tic in­tu­itions and adopt a sim­pler defi­ni­tion which I ap­ply for­mally. Noth­ing wrong with that, but it makes ques­tions about what is and isn’t black­mail sort of silly.

For ex­am­ple, if I say any at­tempt to get money in ex­change for not re­veal­ing in­for­ma­tion, re­gard­less of my state of mind, is black­mail, then the fol­low­ing sce­nario is clearly black­mail:

I develop a practical, cheap, unlimited energy source in my basement. Julia says “Honey, I work for the oil company, and we will pay you $N/week for the rest of your life if you keep your mouth shut about that energy source.” I agree and take the money. My native speaker’s intuitions are very clear that this is not blackmail, but I’m happy to use the term “blackmail” as a term of art to describe it if that makes communication with you (and perhaps Vladimir Nesov) easier.

• The transition between T1 and T2 evidently has something to do with the transient belief that her silence was worth $4k/week

Sorry. I think I com­mu­ni­cated un­clearly, which is the dan­ger of us­ing sto­ries in­stead of ex­am­ples and is my fault en­tirely. At the very start of the story, Ju­lia learns about your wife at the same time she learns about the lot­tery. She had pre­vi­ously thought you were sin­gle and the new in­for­ma­tion shifted her prefer­ence or­der­ing.

Re­gard­ing the ex­am­ple you used (oil com­pany & en­ergy), I also hold it is not black­mail. If I use the pre­vi­ous defi­ni­tion of Black­mail be­ing the act of mak­ing an at­tempt to get money in ex­change for not re­veal­ing in­for­ma­tion, then the at­tempt is the cru­cial part in this case (whether it suc­ceeds or not). The oil com­pany offer­ing me money is okay; me try­ing to get money out of the oil com­pany is black­mail.

• The oil com­pany offer­ing me money is okay; me try­ing to get money out of the oil com­pany is black­mail.

And also some­times okay. The dis­tinc­tion isn’t “okay” vs black­mail. It is black­mail vs not-black­mail and “okay” vs not-okay.

• (nods) As noted el­se­where, I missed this and was en­tirely mis­taken about Ju­lia’s mo­tives. I stand cor­rected. You were perfectly clear, I just wasn’t read­ing at­ten­tively enough.

Re: blackmail… OK. So, if I develop the technology and I approach the oil company and say “I have this technology, I’ll guarantee you exclusive rights to it for $N/week,” that’s blackmail?

• I’d say it’s much closer to blackmail than the original oil company scenario.

• I suppose I agree with that, but I wouldn’t call either of them blackmail. Would you?

• At time T1, Julia prefers to date me rather than end our relationship and tell my wife.

I think in Xachariah’s story Julia did not know prior to seeing you on TV that you have a wife. So there was no time at which she had the preference you describe here.

• Ah! Good point, I forgot about that. You’re absolutely right… throughout, she presumably prefers to break up with me than date me if I’m married. My error.

• At time T1, Julia prefers to date me rather than end our relationship and tell my wife.

Something about this confused me.

• Not sure if this is what confused you or not, but it has since been pointed out to me that I was wrong; Julia does not necessarily (and ought not be understood to) have this preference, as she did not know about my wife at T1.

• Not sure if this is what confused you or not

No, it was just you talking about your wife in first person! :)

• Ah. Well, my husband had one once, I suppose I might feel left out.

• If we avoid the overloaded term “blackmail” and talk of threats vs. trade, Angela is threatening you whereas Julia is offering a trade. I agree that this example shows that “makes you suffer” is not the distinguishing element. It’s also interesting that you may not know if the situation is threat or trade (you may not know whether the mistress wants to tell your wife anyway).

• I’m not sure how threats and trade are a real dichotomy rather than two fuzzy categories. Suppose I buy food. That’s basic trade.
But at the same time a monopoly could raise the price of food a lot, and I would still have to buy it, and now it is the threat of starvation. I can go fancy(N) and say I won’t pay more than X for food, I would rather starve to death and then they get no more of my money. If I can make it credible, and if the monopoly reasons in the fancy(N-1) manner, they won’t raise the price above X because I won’t pay; but if the monopoly reasons in the fancy(N) manner, it does the exact same reasoning and concludes that it should ignore my threat to starve myself to death and not pay. Most human agents seem to be tit-for-tat and mirror whatever you are doing, so if you are reasoning “I’ll just starve myself to death rather than pay” they reason like “I’ll just raise the price regardless, and to hell with whether he pays”. The blackmail-resistant agent is also blackmail-resistance resistant.

• I’m not sure how threats and trade are a real dichotomy rather than two fuzzy categories.

This is my position as well, blackmail probably doesn’t need to be considered as a separate case, reasonable behavior in such cases will probably just fall out from a sufficiently savvy bargaining algorithm.

• I agree with this, incidentally.

• Good point; haggling is a good example of a fuzzy boundary between threats and trade. If A is willing to sell a widget for any price above $10, and B is willing to buy a widget for any price below $20, and there are no other buyers or sellers, then for any price X strictly between $10 and $20, A saying “I won’t sell for less than X” and B saying “I won’t buy for more than X” are both threats under my model. Which means that agents that “naively” precommit to never respond to any threats (the way I understand them) will not reach an agreement when haggling. They’ll also fail at the Ultimatum game.
So there needs to be a better model for threats, possibly one that takes Schelling points into account; or maybe there should be a special category for “the kind of threats it’s beneficial to precommit to ignore”.

• Hmm, the pre-commitment to ignore would depend on other agents and their pre-pre-commitment to ignore pre-commitments. It just goes recursive like Sherlock Holmes vs Moriarty, and when you go meta and try to look for the ‘limit’ of the recursion, it goes recursive again… I have a feeling that it is inherently a rock-paper-scissors situation where you can’t cheat like this robot. (I.e. I would suggest, at that point, trying to make a bunch of proofs of impossibility to narrow expectations down somewhat.)

• It’s not possible to coordinate in general against arbitrary opponents, like it’s impossible to predict what an arbitrary program does, but it’s advantageous for players to eventually coordinate their decisions (on some meta-level of precommitment). On one hand, players want to set prices their way, but on the other they want to close the trade eventually, and this tradeoff keeps the outcome from both extremes (“unfair” prices and impossibility of trade). Players have an incentive to set up some kind of Loebian cooperation (as in these posts), which stops the go-meta regress, although each will try to set the point where cooperation happens in their favor.

• I was thinking rather of Halting Problem-like impossibility, along with a rock-paper-scissors situation that prevents declaring any one strategy, even the cooperative one, as the ‘best’.
• If difficulty of selecting and implementing a strategy is part of the tradeoff (so that more complicated strategies count as “worse” because of their difficulty, even if they promise an otherwise superior outcome), maybe there are “best” strategies in some sense, like there is a biggest natural number that you can actually write down in 30 seconds. (Such things would of course have the character of particular decisions, not of decision theory.)

• There is not a biggest natural number that you can actually write down in thirty seconds: that’s equivalent to Berry’s paradox.

• Huh? Just start writing. The rule wasn’t “the number you can define in 30 seconds”, but simply “the number you can write down in 30 seconds”. Like the number of strawberries you can eat in 30 seconds, no paradox there!

• I was reading “write down” more generally than “write down each digit of in base ten,” but I guess that’s not how you meant it.

• Hmm, if it was a programming contest I would expect non-transitive ‘betterness’.

• Given a fixed state of knowledge about possible opponents and a finite number of feasible options for your decision, there will be maximal decisions, even if in an iterated contest the players could cycle their decisions against updated opponents indefinitely.

• That’s a promise, not a threat, by Schelling’s terminology. Once the movie star upholds her end of the bargain, the man has no incentive to keep his promise, and every incentive to break it. Is there something game-theoretic about blackmail that makes it an identifiable subset of the group threats + promises? Note that Schelling also describes a negotiating position that is neither threat nor promise, but the combination of the two. And that’s not exactly blackmail either. I suspect you could come up with a blackmail scenario that fits any of the three groupings.
• That’s not blackmail at all. It seems like blackmail because of the questionable morality of selling secretly recorded sex tapes, but giving the movie star the chance to buy the tape first doesn’t make the whole thing any less moral than it would be without that chance, and unlike real blackmail the movie star being known not to respond to blackmail doesn’t help in any way.

Consider this variation: Instead of a secret tape, the movie star voluntarily participated in an amateur porno that was intended to be publicly released from the beginning, but held up for some reason, and all that happened before the movie star became famous in the first place. The producer knows that releasing the tape will hurt her career and offers to let her buy the tape to prevent it from being released. This doesn’t seem like blackmail at all, and the only change was to the moral (and legal) status of releasing the tape, not to the trade.

• I still classify it as blackmail. Something similar to this happened to Cameron Diaz, although the rights to resell the photos were questionable. She posed topless in some bondage shots for a magazine, but they were never printed. The photographer kept the shots and the recording of the photo shoot for ten years until one of the Charlie’s Angels films was about to come out. He offered them to her for a couple of million or he would sell them to the highest bidder. The courts didn’t buy that he was just offering her first right of refusal and sentenced him for attempted grand theft (blackmail), forgery, and perjury (for modifying release forms and lying about it). Link

• Are you sure you aren’t just pattern matching to similarity to known types of blackmail? Do you think it would be useful for an AI to classify it the same way (which was the starting point of this thread)?
Your link doesn’t go into much detail, but it seems like he was convicted because he was lying and making up the negative consequences he threatened her with, and like he was going out of his way to make the consequences of selling to someone else as bad as possible rather than maximizing revenue (or at least making her believe so). That would qualify this case as blackmail under the definition above, unlike either of our hypothetical examples.

• But you’re forgetting the man’s best option: get lots of money from the movie star and get a smaller amount from the press.

Edit: Ah, not lump-sum payment. I can see how that would work then.

• But you’re forgetting the man’s best option: get lots of money from the movie star and get a smaller amount from the press.

And be a lot more vulnerable to criminal charges for the blackmail.

• I’m assuming that the movie star is at least reasonably smart. The first thing that comes to mind is periodic payments that decrease over time, with the value hovering just above what magazines are willing to pay + bragging rights, since people are less impressed with a 10-year-old sex tape than a brand new one. Eventually the payments would be stopped when either the man has more to lose from releasing the tape than staying quiet (e.g., he’s settled down and married now) or the movie star values money more than the loss of prestige from a scandal (e.g., another scandal breaks or she stops getting roles anyway). I’m sure there are other ways to solve the problem as well, but regardless it’s a technical hurdle rather than an absolute one.

• Continuum behaviors are discussed in some detail by Schelling, and interestingly they can be used by both parties. Here they make the blackmail more effective. If the payment is lump-sum, the blackmailer can’t be trusted, and so the movie star won’t pay.
The continuous payment option gives her a way to pay the blackmailer and expect him to stay quiet, which makes her more vulnerable to blackmail in the first place. Continuous options can also be used to derail threats, when the person being threatened can act incrementally and there is no bright line to force the action (assuming the threat is to carry out a single action).

• People here seem confident that there exists a decision theory immune to blackmail.

Really? I hope not. Sounds like a very silly idea to me. Does that just mean unresponsive to coercion? Pretending like you never heard it? Always refusing to give in to a threat? I suppose one could program that into an AI, and when someone says to the AI, “your money or your life”, we’ll get a dead AI. Why not just program your AI to self-destruct when threatened? Sometimes, knuckling under is the right call.

• ...and sometimes choosing to die is.

• Sometimes. As long as it’s only sometimes, maybe you’re better off not always driving off the cliff. I see some people recommending broadcasting your precommitment to not knuckle under to blackmail. Not so good when you run into a blackmailer with a precommitment to always follow through. And not so good when the other guy isn’t certain about your precommitment. Or if he is just a spiteful prick. Never underestimate the power of spite.

• Book suggestion: Schelling’s Strategy of Conflict. Immunity to blackmail comes from the potential blackmailers knowing that you wouldn’t give in to their demands, so they don’t do it. Many decision-making procedures can be constructed that have this property: for example, you could just make a “don’t give in bot” that never gives in to blackmail, and distribute its source code to would-be blackmailers.

• People here seem confident that there exists a decision theory immune to blackmail.
I suspect this is wishful thinking. It would be nice to use it (if it exists), so assume it exists and go looking for it: either you can’t find it, and so it doesn’t matter whether it existed or not in the first place, or you found it, and now you know it existed in the first place and what it is.

• Blackmail and Pascal’s Mugging are not the same. The goal is to make an AI immune to Pascal’s Mugging, and most humans are immune to Pascal’s Mugging/Wager inherently. There is no decision theory that makes you immune to all forms of blackmail, since it is rational to give in in many cases depending on how the payouts are set up.

• In this theme, there is a lot of talk about making decision theories or utility functions that are immune to Pascal’s mugging, but isn’t the whole “FAI scenario, give us money” basically a Pascal’s mugging?

• Yeah, it is if you completely ignore the unique and defining feature of all Pascal’s mugging, the conditionality of the reward on your assessed probability… ಠ_ಠ

• I don’t understand this. In the original Pascal’s wager, it is suggested that you become Christian, start going through the motions, and this will eventually change your belief so that you think God’s existence is likely. But this is not a feature of the generalized Pascal’s mugging, at least not as described on the wiki page. (Also, given that this is the Stupid Questions Thread, I feel your comment would be improved by more explanation and fewer disapproving emoticons...)

• In the original Pascal’s wager, it is suggested that you become Christian, start going through the motions, and this will eventually change your belief so that you think God’s existence is likely.

No, in the original Pascal’s wager you are advised to believe in God, as God would judge you based on your beliefs (i.e. your assessed probability of existence).
How­ever, that doesn’t seem to be the form of the Pas­cal’s mug­ging, which is also dis­cussed quite a bit on this site. The con­di­tion­al­ity of re­ward or pun­ish­ment on sub­jec­tive prob­a­bil­ity es­ti­mates doesn’t seem to be the point at which de­ci­sion the­o­ries break down, but rather they seem to break down with very small prob­a­bil­ities of very large effects. • Ac­tu­ally, Pas­cal did ad­vise “go­ing through the mo­tions” as a solu­tion to be­ing un­able to sim­ply will one­self into be­lief. The wa­ger might not be strong apolo­get­ics, but I give Pas­cal some credit for his grasp of cog­ni­tive dis­so­nance. • I stand cor­rected. • Yes, this is what I was try­ing to say. I see how the phrase “con­di­tion­al­ity of the re­ward on your as­sessed prob­a­bil­ity” could de­scribe Pas­cal’s Wager, but not how it could de­scribe Pas­cal’s Mug­ging. • More con­cisely than the origi­nal/​gw­ern: The al­gorithm used by the mug­ger is roughly: Find your as­sessed prob­a­bil­ity of the mug­ger be­ing able to de­liver what­ever re­ward, be­ing care­ful to spec­ify the size of the re­ward in the con­di­tions for the probability offer an ex­change such that U(pay­ment to mug­ger) < U(re­ward) * P(re­ward) This is an is­sue for AI de­sign be­cause if you use a prior based on Kol­mogorov com­plex­ity than it’s rel­a­tively straight­for­ward to find such a re­ward, be­cause even very large num­bers have rel­a­tively low com­plex­ity, and there­fore rel­a­tively high prior prob­a­bil­ities. • When you have a bunch of other data, you should be not in­ter­ested in the Kol­mogorov com­plex­ity of the num­ber, you are in­ter­ested in Kol­mogorov com­plex­ity of other data con­cate­nated with that num­ber. E.g. 
you should not as­sign higher prob­a­bil­ity that Bill Gates has made pre­cisely 100 000 000 000$ than some ran­dom-look­ing value, as given the other sen­sory in­put you got (from which you de­rived your world model) there are ran­dom-look­ing val­ues that have even lower Kol­mogorov com­plex­ity of to­tal sen­sory in­put, but you wouldn’t be able to find those be­cause Kol­mogorov com­plex­ity is un­com­putable. You end up mis-es­ti­mat­ing Kol­mogorov com­plex­ity when you don’t have it given to you on a plat­ter pre-made.

Actually, what you should use is algorithmic (Solomonoff) probability, like AIXI does, on the history of sensory input, taking a weighted sum over the world models that present you with the mugger’s marketing spiel. The shortest ones simply have the mugger make it up; then there will be models where the mugger tortures beings if you pay and does not torture if you don’t. It’s unclear how this will pan out because, again, it is uncomputable.

In the human approximation, you take what the mugger says as a privileged model, which is strictly speaking an invalid update (the probability jumps from effectively zero, for never having thought about it, to nonzero), and invalid updates come with the cost of being prone to losing money. Constructing the model directly from what the mugger says the model should be is a hack; at that point anything goes, and you can add another hack, of the strategic kind: do not apply this string->model hack to ultra-extraordinary claims made without evidence.

edit: i meant, weighted sum, not ‘se­lect’.
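For readers who prefer code, here is a minimal sketch of the mugger’s rough algorithm quoted earlier in this subthread, U(payment to mugger) < U(reward) * P(reward). Everything in it is an assumption made for illustration: `description_bits` is a crude stand-in for Kolmogorov complexity (which, as noted above, is uncomputable), the prior is simply 2 to the minus that many bits, and utility is taken to be linear in the size of the reward. It looks only at the complexity of the number in isolation, which is exactly the simplification the comment above objects to.

```python
from fractions import Fraction


def description_bits(expression: str) -> int:
    """Crude, hypothetical proxy for complexity: 8 bits per character of a
    short expression that names the number. Real Kolmogorov complexity is
    uncomputable, so this is only a stand-in."""
    return 8 * len(expression)


def complexity_prior(expression: str) -> Fraction:
    """Toy prior: probability 2^-(description length in bits)."""
    return Fraction(1, 2 ** description_bits(expression))


def mugger_offer_is_accepted(u_payment, u_reward, p_reward) -> bool:
    """The mugger's condition: U(payment) < U(reward) * P(reward)."""
    return u_payment < u_reward * p_reward


# A huge reward with a very short description: "10**100" is 7 characters,
# so the toy prior is 2^-56, which is tiny next to the reward's size.
accepts_huge = mugger_offer_is_accepted(5, 10**100, complexity_prior("10**100"))

# A modest reward with a short description does not satisfy the condition.
accepts_small = mugger_offer_is_accepted(5, 10, complexity_prior("10"))

print(accepts_huge, accepts_small)
```

Under these toy assumptions the size of the reward grows far faster than its description length shrinks the prior, which is the problem the original comment points at.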

• The mug­ging is defined as hav­ing con­di­tion­al­ity; just read Bostrom’s pa­per or Bau­mann’s re­ply! That Eliezer did not ex­plic­itly state the mug­ger’s sim­ple al­gorithm, but in­stead im­plied it in his dis­cus­sion of com­plex­ity and size of num­bers, does not ob­vi­ate this point.

• I don’t un­der­stand this.

• ‘I don’t un­der­stand this’ usu­ally means ‘Would some­body please ex­plain?’.

• I might as well take a shot at explaining. Pascal’s wager says I might as well take on the relatively small inconvenience of going through the motions of believing in God, because if the small-probability event occurs that he does exist, the reward is extremely large or infinite (eternal life in heaven, presumably).

Pascal’s mugging instead makes this a relatively small payment ($5 as Yudkowsky phrased it) to avoid or mitigate a minuscule chance that someone may cause a huge amount of harm (putting dust specks in 3^^^^3 people’s eyes or whatever the current version is).

Thus in both cases people are faced with making some small investment to, should an event of minuscule probability occur, vastly increase their utility. A lottery ticket.

• Is there a bet­ter way to read Less Wrong?

I know I can put the se­quences on my kin­dle, but I would like to find a way to browse Dis­cus­sion and Main in a more use­able in­ter­face (or at least some­thing that I can cus­tomize). I re­ally like the thread­ing or­ga­ni­za­tion of news­groups, and I read all of my .rss feeds and mail through Gnus in emacs. I some­times use the Less Wrong .rss feed in Gnus, but this doesn’t al­low me to read the com­ments. Any sug­ges­tions?

Also, if any other emacs users are in­ter­ested, I would love to make a less­wrong-mode pack­age. I’m not a very good lisp hacker, but I think it would be a fun pro­ject.

• How do I put the se­quences on a Kin­dle?

• Thumbs up from me for less­wrong-mode!

• I at­tempted this to­day but with­out an API (LW’s fork of the red­dit code­base looks pretty old) I don’t think I can get very far.

• I kind of wish talk about Newcomb’s problem was presented in terms of source code and AI rather than the more common presentation, since I think it’s much more obvious what is being aimed at when you think about it this way. Is there a reason people prefer the original version?

• Most people aren’t AIs or even programmers (though the latter are fairly common on LW).

• Most people also aren’t presented with Omega situations. The reason it’s important to solve Newcomb’s problem is so that we can make an AI that will respond to the incentives we give it to self-modify in ways we want it to.

• Most peo­ple find the ver­bal de­scrip­tions eas­ier to han­dle.

• Most peo­ple are much more eas­ily mis­led via ver­bal de­scrip­tions.

• Maybe the right thing to do is to mix and match differ­ent pre­sen­ta­tions of the prob­lem? I.e. one per­son might be all like “huh??” or “this is stupid” when­ever New­comb’s prob­lem is dis­cussed, but be like “oh NOW I get it” when it’s pre­sented in terms of AI and source code. Some­body else might be the op­po­site.

• Doesn’t the pa­per cited here on acausal ro­mance im­ply that gains from acausal trade are in­co­her­ent?

The fact that I can imag­ine some­one who can imag­ine ex­actly me doesn’t seem like it im­plies that I can make ma­te­rial gains by act­ing in refer­ence to that in­ac­cessible other.

What am I mi­s­un­der­stand­ing?

• That’s the joke. The whole pa­per is a “Modest Pro­posal” style satire.

It’s de­signed to tear down modal re­al­ism by tak­ing it to the most ab­surd ex­treme. I also de­tected hints of play­ing on the On­tolog­i­cal Ar­gu­ment for the ex­is­tence of God.

• Thanks for that ex­pla­na­tion of the pa­per. Is acausal trade sup­posed to rely on modal re­al­ism, or are they dis­tinct?

• If I’m a moral anti-realist, do I necessarily believe that provably Friendly AI is impossible? When defining friendly, consider Archimedes’ Chronophone, which suggests that friendly AI would (should?) be friendly to just about any human who ever lived.

moral anti-re­al­ism—there are no (or in­suffi­cient) moral facts to re­solve all moral dis­putes an agent faces.

• No. FAI is sup­posed to im­ple­ment an ex­trap­o­lated ver­sion of mankind’s com­bined val­ues, not search for an ob­jec­tively defined moral code to im­ple­ment.

Also: Eliezer has argued that even from its programmers’ perspective, some elements of an FAI’s moral code (Coherent Extrapolated Volition) will probably look deeply immoral. (But will actually be OK.)

• Why does the moral anti-re­al­ist think “an ex­trap­o­lated ver­sion of mankind’s com­bined val­ues” ex­ists or is ca­pa­ble of be­ing cre­ated? For the moral re­al­ists, the an­swer is easy—the ex­is­tence of ob­jec­tive moral facts shows that, in prin­ci­ple, some moral sys­tem that all hu­mans could en­dorse could be dis­cov­ered/​ar­tic­u­lated.

As an aside, CEV is a proposed method for finding what an FAI would implement. I think that one could think FAI is possible even if CEV were the wrong track for finding what FAI should do. In short, CEV is not necessarily part of the definition of Friendly.

• Well, to as­sert that “an ex­trap­o­lated ver­sion of mankind’s com­bined val­ues can be cre­ated” doesn’t re­ally as­sert much, in and of it­self… just that some al­gorithm can be im­ple­mented that takes mankind’s val­ues as in­put and gen­er­ates a set of val­ues as out­put. It seems pretty likely that a large num­ber of such al­gorithms ex­ist.

Of course, what CEV pro­po­nents want to say, ad­di­tion­ally, is that some of these al­gorithms are such that their out­put is guaran­teed to be some­thing that hu­mans ought to en­dorse. (Which is not to say that hu­mans ac­tu­ally would en­dorse it.)

It’s not even clear to me that moral re­al­ists should be­lieve this. That is, even if I posit that ob­jec­tive moral facts ex­ist, it doesn’t fol­low that they can be de­rived from any al­gorithm ap­plied to the con­tents of hu­man minds.

But I agree with you that it’s still less clear why moral anti-re­al­ists should be­lieve it.

• Peo­ple can hold differ­ent moral views. Some­times these views are op­posed and any com­pro­mise would be called im­moral by at least one of them. Any AI that en­forced such a com­pro­mise, would be called unFriendly by at least one of them.

Even for a moral re­al­ist (and I don’t think well of that po­si­tion), the above re­mains true, be­cause peo­ple demon­stra­bly have ir­rec­on­cilably differ­ent moral views. If you’re a moral re­al­ist, you have the choice of:

1. Im­ple­ment ob­jec­tive moral truth how­ever defined, and ig­nore ev­ery­one’s ac­tual moral feel­ings. In which case FAI is ir­rele­vant—if the moral truth tells you to be unFriendly, you do it.

2. Im­ple­ment some pre-cho­sen func­tion—your own morals, or many peo­ple’s morals like CEV, or some other thing that does not de­pend on moral truth.

If you’re a moral anti-re­al­ist, you can only choose 2, be­cause no moral truth ex­ists. That’s the only differ­ence stem­ming from be­ing a moral re­al­ist or anti-re­al­ist.

Does this mean that Friendly-to-ev­ery­one AI is im­pos­si­ble in moral anti-re­al­ism? Cer­tainly, be­cause peo­ple have fun­da­men­tal moral dis­agree­ments. But moral re­al­ism doesn’t help! It just adds the op­tion of fol­low­ing some “moral facts” which some or all hu­mans dis­agree with, which is no bet­ter in terms of Friendli­ness than ex­ist­ing op­tions. (If all hu­mans agreed with some set of pur­ported moral facts, peo­ple wouldn’t have needed to in­vent the con­cept of moral facts in the first place.)

• The ex­is­tence of moral dis­agree­ment, stand­ing alone, is not enough to show moral re­al­ism is false. After all, sci­en­tific dis­agree­ment doesn’t show phys­i­cal re­al­ism is false.

Fur­ther, I am con­fused by your por­trayal of moral re­al­ists. Pre­sum­ably, the re­al­ity of moral facts would show that peo­ple act­ing con­trary to those facts were mak­ing a mis­take, much like peo­ple who thought “Ob­jects in mo­tion will tend to come to a stop” were mak­ing a mis­take. It seems strange to call cor­rect­ing that mis­take “ig­nor­ing ev­ery­one’s ac­tual sci­en­tific feel­ings.” Like­wise, if I am un­know­ingly do­ing wrong, and you can prove it, I would not view that cor­rec­tion as ig­nor­ing my moral feel­ings—I want to do right, not just think I am do­ing right.

In short, I think that the position you are labeling “moral realist” is just a very confused version of moral anti-realism. Moral realists can and should reject the idea that the mere existence at any particular moment of moral disagreement is useful evidence of whether there is one right answer. In other words, a distinction should be made between the existence of moral disagreement and the long-term persistence of moral disagreement.

• The ex­is­tence of moral dis­agree­ment, stand­ing alone, is not enough to show moral re­al­ism is false.

I didn’t say that it was. Rather I pointed out the differ­ence be­tween moral­ity and Friendli­ness.

For an AI to be able to be Friendly to­wards ev­ery­one re­quires not moral re­al­ism, but “friendli­ness re­al­ism”—which is ba­si­cally the idea that a sin­gle be­hav­ior of the AI can satisfy ev­ery­one. This is clearly false if “ev­ery­one” means “all in­tel­li­gences in­clud­ing aliens, other AIs, etc.” It may be true if we re­strict our­selves to “all hu­mans” (and stop hu­mans from di­ver­sify­ing too much, and don’t in­clude hy­po­thet­i­cal or far-past hu­mans).

I, per­son­ally, be­lieve the bur­den of proof is on those who be­lieve this to be pos­si­ble to demon­strate it. My prior for “all hu­mans” says they are a very di­verse and self­ish bunch and not go­ing to be satis­fied by any one ar­range­ment of the uni­verse.

Re­gard­less, moral re­al­ism and friendli­ness re­al­ism are differ­ent. If you built an ob­jec­tively moral but unFriendly AI, that’s the sce­nario I dis­cussed in my pre­vi­ous com­ment—and peo­ple would be un­happy. OTOH, if you think a Friendly AI is by log­i­cal ne­ces­sity a moral one (un­der moral re­al­ism), that’s a very strong claim about ob­jec­tive morals—a claim that peo­ple would per­ceive an AI im­ple­ment­ing ob­jec­tive morals as Friendly. This is a far stronger claim than that peo­ple who are suffi­ciently ed­u­cated and ex­posed to the right knowl­edge will come to agree with cer­tain uni­ver­sal ob­jec­tive morals. A Friendly AI means one that is Friendly to peo­ple as they re­ally are, here and now. (As I said, to me it seems very likely that an AI can­not in fact be Friendly to ev­ery­one at once.)

• I think we are sim­ply hav­ing a defi­ni­tional dis­pute. As the term is used gen­er­ally, moral re­al­ism doesn’t mean that each agent has a moral­ity, but that there are facts about moral­ity that are ex­ter­nal to the agent (i.e. ob­jec­tive). Now, “ob­jec­tive” is not iden­ti­cal to “uni­ver­sal,” but in prac­tice, ob­jec­tive facts tend to cause con­ver­gence of be­liefs. So I think what I am call­ing “moral re­al­ism” is some­thing like what you are call­ing “Friendli­ness re­al­ism.”

Lengthening the inferential distance further is that realism is a two-place word. As you noted, there is a distinction between realism(Friendliness, agents) and realism(Friendliness, humans).

That said, I do think that “peo­ple would per­ceive an AI im­ple­ment­ing ob­jec­tive morals as Friendly” if I be­lieved that ob­jec­tive morals ex­ist. I’m not sure why you think that’s a stronger claim than “peo­ple who are suffi­ciently ed­u­cated and ex­posed to the right knowl­edge will come to agree with cer­tain uni­ver­sal ob­jec­tive morals.” If you be­lieved that there were ob­jec­tive moral facts and knew the con­tent of those facts, wouldn’t you try to ad­just your be­liefs and ac­tions to con­form to those facts, in the same way that you would ad­just your phys­i­cal-world be­liefs to con­form to ob­jec­tive phys­i­cal facts?

• I think we are sim­ply hav­ing a defi­ni­tional dis­pute.

That seems likely. If moral realists think morality is a one-place word, and anti-realists think it’s a two-place word, we would be better served by using two distinct words.

It is some­what un­clear to me what moral re­al­ists are think­ing of, or claiming, about what­ever it is they call moral­ity. (Even af­ter tak­ing into ac­count that differ­ent peo­ple iden­ti­fied as moral re­al­ists do not all agree on the sub­ject.)

So I think what I am call­ing “moral re­al­ism” is some­thing like what you are call­ing “Friendli­ness re­al­ism.”

I defined ‘Friendli­ness (to X)’ as ‘be­hav­ing to­wards X in the way that is best for X in some im­plied sense’. Ob­vi­ously there is no Friendli­ness to­wards ev­ery­one, but there might be Friendli­ness to­wards hu­mans: then “Friendli­ness re­al­ism” (my coin­ing) is the be­lief that there is a sin­gle Friendly-to­wards-hu­mans be­hav­ior that will in fact be Friendly to­wards all hu­mans. Whereas Friendli­ness anti-re­al­ism is the be­lief no one be­hav­ior would satisfy all hu­mans, and it would in­evitably be unFriendly to­wards some of them.

Clearly this dis­cus­sion as­sumes many givens. Most im­por­tantly, 1) what ex­actly counts as be­ing Friendly to­wards some­one (are we util­i­tar­ian? what kind? must we agree with the tar­get hu­man as to what is Friendly to­wards them? If we in­fluence them to come to like us, when is that al­lowed?). 2) what is the set of ‘all hu­mans’? Do past, dis­tant, fu­ture ex­pected, or en­tirely hy­po­thet­i­cal peo­ple count? What is the value of cre­at­ing new peo­ple? Etc.

My position is that: 1) for most common assumed answers to these questions, I am a “Friendliness anti-realist”; I do not believe any one behavior by a superpowerful universe-optimizing AI would count as Friendliness towards all humans at once. And 2), insofar as I have seen moral realism explained, it seems to me to be incompatible with Friendliness realism. But it’s possible some people mean something entirely different by “morals” and by “moral realism” than what I’ve read.

If you be­lieved that there were ob­jec­tive moral facts and knew the con­tent of those facts, wouldn’t you try to ad­just your be­liefs and ac­tions to con­form to those facts

That’s a tau­tol­ogy: yes I would. But, the as­sump­tion is not valid.

Even if you as­sume there ex­ist ob­jec­tive moral facts (what­ever you take that to mean), it does not fol­low that you would be able to con­vince other peo­ple that they are true moral facts! I be­lieve it is ex­tremely likely you would not be able to con­vince peo­ple—just as to­day most peo­ple in the world seem to be moral re­al­ists (mostly re­li­gious), and yet hold widely differ­ing moral be­liefs and when they con­vert to an­other set of be­liefs it is al­most never due to some sort of ra­tio­nal con­vinc­ing.

It would be nice to live in a world where you could start from the premise that “peo­ple be­lieve that there are ob­jec­tive moral facts and know the con­tent of those facts”. But in prac­tice we, and any fu­ture FAI, will live in a world where most peo­ple will re­ject mere ver­bal ar­gu­ments in fa­vor of new morals con­tra­dict­ing their cur­rent ones.

• If I’m a moral anti-realist, do I necessarily believe that provably Friendly AI is impossible?

No. I mean, I’m un­sure about the pos­si­bil­ity of prov­ably Friendly AI but it’s not ob­vi­ous that anti-re­al­ism makes it im­pos­si­ble. Mo­ral re­al­ism, were it the case, might make things eas­ier but it’s hard for me to imag­ine what that world looks like.

• Let us define a moral­ity func­tion F() as tak­ing as in­put x=the fac­tual cir­cum­stances an agent faces in mak­ing a de­ci­sion, out­putting y=the de­ci­sion the agent makes. It is fairly ap­par­ent that prac­ti­cally ev­ery agent has an F(). So ELIEZER(x) is the func­tion that de­scribes what Eliezer would choose in situ­a­tion x. Next, define GROUP{} as the set of moral­ity func­tions run by all the mem­bers of that group.

Let us define CEV() as the func­tion that takes as in­put a moral­ity func­tion or set of moral­ity func­tions and out­puts a moral­ity func­tion that is im­proved/​made con­sis­tent/​ex­trap­o­lated from the in­put. I’m not as­sert­ing the ac­tual CEV for­mu­la­tion will do that, but it is a ges­ture to­wards the goal that CEV() is sup­posed to solve.

For clarity, let the output of CEV(F()) = CEV.F(). Thus, CEV.ELIEZER() is the extrapolated morality from the morality Eliezer is running. In parallel, CEV.AMERICA() (which is the output of CEV(AMERICA{})) is the single morality function that is the extrapolated morality of everyone in the United States. If CEV() exists, an AI considering/implementing CEV.JOHNDOE() is Friendly to John Doe. Likewise, CEV.GROUP() leads to an AI that is Friendly to every member of the group.
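The notation above can be read as types, which the following sketch tries to make concrete. It is only an illustration under loud assumptions: the agents `alice`, `bob` and `carol` are hypothetical, and `toy_cev` is a plurality vote standing in for CEV(), not the actual CEV proposal (nobody in this thread claims to know how to build that). The only point is the shape: a morality function maps a situation to a decision, a group is a collection of such functions, and CEV() maps a group to a single morality function, if it can produce an output at all.

```python
from collections import Counter
from typing import Callable, Sequence

Situation = str
Decision = str
MoralityFn = Callable[[Situation], Decision]  # F(): situation x -> decision y


def toy_cev(group: Sequence[MoralityFn]) -> MoralityFn:
    """Toy stand-in for CEV(): plurality vote over the members' decisions.
    This is NOT the real CEV; it only has the same input/output shape."""
    if not group:
        raise ValueError("empty group: nothing to extrapolate from")

    def extrapolated(x: Situation) -> Decision:
        ranked = Counter(f(x) for f in group).most_common()
        # A tie means this toy procedure has no single output for x.
        if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
            raise ValueError(f"no single output for situation {x!r}")
        return ranked[0][0]

    return extrapolated


# Hypothetical agents, each running their own morality function.
def alice(x: Situation) -> Decision:
    return "share" if x == "found wallet" else "abstain"


def bob(x: Situation) -> Decision:
    return "share"


def carol(x: Situation) -> Decision:
    return "keep" if x == "found wallet" else "abstain"


GROUP = [alice, bob, carol]      # GROUP{}: the set of morality functions
cev_group = toy_cev(GROUP)       # CEV.GROUP(): one morality function

print(cev_group("found wallet"))
```

Note that even this toy version can fail to produce an output (for example on a two-member group that splits evenly), so whether a real CEV() always has an output is a substantive question rather than a matter of definition.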

For FAI to be pos­si­ble, CEV() must out­put for (A) any moral­ity func­tion or (B) set of moral­ity func­tions. Fur­ther, for prov­able FAI, it must be pos­si­ble to (C) math­e­mat­i­cally show the out­put of CEV() be­fore turn­ing on the AI.

If moral re­al­ism is false, why is there rea­son to think (A), (B), or (C) are true?

• For FAI to be pos­si­ble, CEV() must out­put for (A) any moral­ity func­tion or (B) set of moral­ity functions

Any set? Why not just require that CEV.HUMANITY() be possible? It seems like there are some sets of morality functions G that would be impossible (G={x, ~x}?). Human value is really complex so it’s a difficult thing to a) model it and b) prove the model. Obviously I don’t know how to do that; no one does yet. If moral realism were true and morality were simple and knowable I suppose that would make the job a lot easier… but that doesn’t seem like a world that is still possible. Conversely, morality could be both real and unknowable and impossibly complicated and then we’d be in even worse shape because learning about human values wouldn’t even tell us how to do Friendly AI! Maybe if you gave me some idea of what your alternative to anti-realism would look like I could answer better. In short: Friendliness is really hard; part of the reason it seems so hard to me might have to do with my moral anti-realism, but I have trouble imagining plausible realist worlds where things are easier.

• First, a terminology point: CEV.HUMANITYCURRENTLYALIVE() != CEV.ALLHUMANITYEVER(). For the anti-realist, CEV.HUMANITYCURRENTLYALIVE() is massively more plausible, and CEV.LONDON() is more plausible than that, but my sense is that this sentence depends on the anti-realist accepting some flavor of moral relativism.

Se­cond, it seems likely that fairly large groups (i.e. the pop­u­la­tion of Lon­don) already have some {P, ~P}. That’s one rea­son to think mak­ing CEV() is re­ally hard.

Human value is really complex so it’s a difficult thing to a) model it and b) prove the model.

I don’t un­der­stand what prov­ing the model means in this con­text.

If moral re­al­ism were true and moral­ity were sim­ple and know­able I sup­pose that would make the job a lot eas­ier… but that doesn’t seem like a world that is still pos­si­ble.

I don’t un­der­stand why you talk about pos­si­bil­ity. “Mo­ral­ity is true, sim­ple, and know­able” seems like an em­piri­cal propo­si­tion: it just turns out to be false. It isn’t ob­vi­ous to me that sim­ple moral re­al­ism is nec­es­sar­ily false in the way that 2+5=7 is nec­es­sar­ily true.

Con­versely, moral­ity could be both real and unknowable

How does the world look differ­ent if moral­ity is real and in­ac­cessible vs. not real?

Maybe if you gave me some idea of what your al­ter­na­tive to anti-re­al­ism would look like I could an­swer bet­ter.

Pace cer­tain is­sues about hu­man ap­petites as ob­jec­tive things, I am an anti-re­al­ist—in case that wasn’t clear.

• First, a ter­minol­ogy point: CEV.HUMANITYCURRENTLYALIVE() != CEV.ALLHUMANITYEVER

Sure sure. But CEV.ALLHUMANITYEVER is also not the same as CEV.ALLPOSSIBLEAGENTS.

Se­cond, it seems likely that fairly large groups (i.e. the pop­u­la­tion of Lon­don) already have some {P, ~P}.

Some sub­rou­tines are prob­a­bly in­verted but there prob­a­bly aren’t peo­ple with fully negated util­ity func­tions from other peo­ple. Trade-offs needn’t mean ir­rec­on­cilable differ­ences. Like I doubt there is any­one in the world who cares as much as you do about the ex­act op­po­site of ev­ery­thing you care about.

Human value is really complex so it’s a difficult thing to a) model it and b) prove the model.

I don’t un­der­stand what prov­ing the model means in this con­text.

Show with some con­fi­dence that it doesn’t lead to ter­rible out­comes if im­ple­mented.

“Morality is true, simple, and knowable” seems like an empirical proposition: it just turns out to be false. It isn’t obvious to me that simple moral realism is necessarily false in the way that 2+5=7 is necessarily true.

I’m not sure that it is. But when I said “still” pos­si­ble I meant that we have more than enough ev­i­dence to rule out the pos­si­bil­ity that we are liv­ing in such a world. I didn’t mean to im­ply any be­liefs about ne­ces­sity. That said I am pretty con­fused about what it would mean for there to be ob­jec­tive facts about right and wrong. Usu­ally I think true be­liefs are sup­posed to con­strain an­ti­ci­pated ex­pe­rience. Since moral judg­ments don’t do that… I’m not quite sure I know what moral re­al­ism would re­ally mean.

How does the world look differ­ent if moral­ity is real and in­ac­cessible vs. not real?

I imag­ine it wouldn’t look differ­ent but since there is no ob­vi­ous way of prov­ing a moral­ity log­i­cally or em­piri­cally I can’t see how moral re­al­ists would be able to rule it out.

Pace cer­tain is­sues about hu­man ap­petites as ob­jec­tive things, I am an anti-re­al­ist—in case that wasn’t clear.

Oh I un­der­stand that. I just meant that when you ask:

If I’m a moral anti-realist, do I necessarily believe that provably Friendly AI is impossible?

I’m won­der­ing “Op­posed to what?”. I’m hav­ing trou­ble imag­in­ing the per­son for whom the prospects of Friendly AI are much brighter be­cause they are a moral re­al­ist.

• If I’m a moral anti-realist, do I necessarily believe that provably Friendly AI is impossible?

I’m won­der­ing “Op­posed to what?”. I’m hav­ing trou­ble imag­in­ing the per­son for whom the prospects of Friendly AI are much brighter be­cause they are a moral re­al­ist.

It seems to me that moral re­al­ists have more rea­son to be op­ti­mistic about prov­ably friendly AI than anti-re­al­ists. The steps to com­ple­tion are rel­a­tively straight­for­ward: (1) Ri­gor­ously de­scribe the moral truths that make up the true moral­ity. (2) Build an AGI that max­i­mizes what the true moral­ity says to max­i­mize.

I’m not quite sure I know what moral re­al­ism would re­ally mean.

I think Alice, a uni­tary moral re­al­ist, be­lieves she is jus­tified in say­ing: “Any­one whose moral­ity func­tion does not out­put Q in situ­a­tion q is a defec­tive hu­man, roughly analo­gous to the way any hu­man who never feels hun­gry is defec­tive in some way.”

Bob, a plu­ral­ist moral re­al­ist, would say: “Any­one whose moral­ity func­tion does not out­put from the set {Q1, Q2, Q3} in situ­a­tion q is a defec­tive hu­man.”

Char­lie, a moral anti-re­al­ist, would say Alice and Bob’s state­ments are both mis­lead­ing, be­ing his­tor­i­cally con­tin­gent, or in­ca­pable of be­ing eval­u­ated for truth, or some other prob­lem.

Con­sider the fol­low­ing state­ment:

“Every (moral) de­ci­sion a hu­man will face has a sin­gle choice that is most con­sis­tent with hu­man na­ture.”

To me, that po­si­tion im­plies that moral re­al­ism is true. If you dis­agree, could you ex­plain why?

I imag­ine it wouldn’t look differ­ent [if moral­ity is real and in­ac­cessible vs. not real] but since there is no ob­vi­ous way of prov­ing a moral­ity log­i­cally or em­piri­cally I can’t see how moral re­al­ists would be able to rule it out.

What is at stake in the dis­tinc­tion? A set of facts that can­not have causal effect might as well not ex­ist. Com­pare er­ror the­o­rists to in­ac­cessibil­ity moral re­al­ists—the former say value state­ments can­not be eval­u­ated for truth, the lat­ter say value state­ments could be true, but in prin­ci­ple, we will never know. For any ac­tual prob­lem, both schools of thought recom­mend the same stance, right?

• moral re­al­ists have more rea­son to be op­ti­mistic about prov­ably friendly AI than anti-re­al­ists. The steps to com­ple­tion are rel­a­tively straight­for­ward: (1) Ri­gor­ously de­scribe the moral truths that make up the true moral­ity. (2) Build an AGI that max­i­mizes what the true moral­ity says to max­i­mize.

Is step 1 even nec­es­sary? Pre­sum­ably in that uni­verse one could just build an AGI that was smart enough to in­fer those moral truths and im­ple­ment them, and turn it on se­cure in the knowl­edge that even if it im­me­di­ately started dis­assem­bling all available mat­ter to make prime-num­bered piles of pa­per­clips, it would be do­ing the right thing. No?

• That’s an in­ter­est­ing point. I sup­pose it de­pends on whether a moral re­al­ist can think some­thing can be morally right for one class of agents and morally wrong for an­other class. I think such a po­si­tion is con­sis­tent with moral re­al­ism. If that is a moral re­al­ist po­si­tion, then the AI pro­gram­mer should be wor­ried that an un­con­strained AI would nat­u­rally de­velop a moral­ity func­tion differ­ent than CEV.HUMANITY().

In other words, when we say moral re­al­ist, are we us­ing a two part word with un­for­tu­nate am­bi­guity be­tween re­al­ism(moral­ity, agent) and re­al­ism(moral­ity, hu­mans)? Wow, I never con­sid­ered whether this was part of the in­fer­en­tial dis­tance in these types of dis­cus­sions.

• Well, to start with, I would say that CEV is beside the point here. In a uni­verse where there ex­ist moral truths that make up the true moral­ity, if what I want is to do the right thing, there’s no par­tic­u­lar rea­son for me to care about any­one’s vo­li­tion, ex­trap­o­lated or oth­er­wise. What I ought to care about is dis­cern­ing those moral truths. Maybe I can dis­cern them by an­a­lyz­ing hu­man psy­chol­ogy, maybe by an­a­lyz­ing the hu­man genome, maybe by an­a­lyz­ing the phys­i­cal struc­ture of car­bon atoms, maybe by an­a­lyz­ing the for­mal prop­er­ties of cer­tain kinds of com­pu­ta­tions, I dunno… but what­ever lets me figure out those moral truths, that is what I ought to be at­tend­ing to in such a uni­verse, and if hu­man­ity’s vo­li­tion con­flicts with those truths, so much the worse for hu­man­ity.

So the fact that an un­con­strained AI might—or even is guaran­teed to—de­velop a moral­ity func­tion differ­ent than CEV.HUMANITY() is not, in that uni­verse, a rea­son not to build an un­con­strained AI. (Well, not a moral rea­son, any­way. I can cer­tainly choose to forego do­ing the right thing in that uni­verse if it turns out to be some­thing I per­son­ally dis­like, but only at the cost of be­hav­ing im­morally.)

But that’s beside your main point, that even in that uni­verse the moral truths of the uni­verse might be such that differ­ent be­hav­iors are most right for differ­ent agents. I agree with this com­pletely. Another way of say­ing it is that to­tal right­ness is po­ten­tially max­i­mized when differ­ent agents are do­ing (spe­cific) differ­ent things. (This might be true in a non-moral-re­al­ist uni­verse as well.)

Ac­tu­ally, it may be use­ful here to be ex­plicit about what we think a moral truth is in that uni­verse. That is, is it a fact about the cor­rect state of the world? Is it a fact about the cor­rect be­hav­ior of an agent in a given situ­a­tion, in­de­pen­dent of con­se­quences? Is it a fact about the cor­rect way to be, re­gard­less of be­hav­ior or con­se­quences? Is it some­thing else?

• Could some­one please ex­plain to me ex­actly, pre­cisely, what a util­ity func­tion is? I have seen it called a perfectly well-defined math­e­mat­i­cal ob­ject as well as not-vague, but as far as I can tell, no one has ever ex­plained what one is, ever.

The words “pos­i­tive af­fine trans­for­ma­tion” have been used, but they fly over my head. So the For Dum­mies ver­sion, please.

• Given an agent with some set X of choices, a util­ity func­tion u maps from the set X to the real num­bers R. The map­ping is such that the agent prefers x1 to x2 if and only if u(x1) > u(x2). This com­pletes the defi­ni­tion of an or­di­nal util­ity func­tion.

A car­di­nal util­ity func­tion satis­fies ad­di­tional con­di­tions which al­low easy con­sid­er­a­tion of prob­a­bil­ities. One way to state these con­di­tions is that prob­a­bil­ities defined on X are re­quired to be lin­ear over u. This means that we can now con­sider prob­a­bil­is­tic mixes of choices from X (with prob­a­bil­ities sum­ming to 1). For ex­am­ple, one valid mix would be 0.25 prob­a­bil­ity of x1 with 0.75 prob­a­bil­ity of x2, and a sec­ond valid mix would be 0.8 prob­a­bil­ity of x3 with 0.2 prob­a­bil­ity of x4. A car­di­nal util­ity func­tion must satisfy the con­di­tion that the agent prefers the first mix to the sec­ond mix if and only if 0.25u(x1) + 0.75u(x2) > 0.8u(x3) + 0.2u(x4).

Car­di­nal util­ity func­tions can also be for­mal­ized in other ways. E.g., an­other way to put it is that the rel­a­tive differ­ences be­tween util­ities must be mean­ingful. For in­stance, if u(x1) - u(x2) > u(x3) - u(x4), then the agent prefers x1 to x2 more than it prefers x3 to x4. (This prop­erty need not hold for or­di­nal util­ity func­tions.)
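As a rough illustration of the cardinal condition above, here is a minimal Python sketch. The utility numbers and the helper name `expected_utility` are made up for illustration; only the two mixes (0.25/0.75 and 0.8/0.2) come from the example above.

```python
# Hypothetical utilities for four choices; the numbers are illustrative only.
u = {"x1": 10.0, "x2": 2.0, "x3": 5.0, "x4": 1.0}


def expected_utility(mix, utility):
    """Expected utility of a probabilistic mix given as {choice: probability}."""
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("probabilities in a mix must sum to 1")
    return sum(p * utility[x] for x, p in mix.items())


mix_a = {"x1": 0.25, "x2": 0.75}  # first mix from the text
mix_b = {"x3": 0.8, "x4": 0.2}    # second mix from the text

eu_a = expected_utility(mix_a, u)
eu_b = expected_utility(mix_b, u)

# With a cardinal utility function, the agent prefers the first mix
# exactly when its expected utility is larger.
prefers_a = eu_a > eu_b
print(eu_a, eu_b, prefers_a)
```

With these made-up numbers the second mix comes out ahead, even though x1 is the single best outcome, because the first mix puts most of its weight on the much worse x2.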

Other notes:

• In my ex­pe­rience, or­di­nal util­ity func­tions are nor­mally found in eco­nomics, whereas car­di­nal util­ity func­tions are found in game the­ory (where they are es­sen­tial for any dis­cus­sion of mixed strate­gies). Most, if not all, dis­cus­sions on LW use car­di­nal util­ity func­tions.

• The VNM the­o­rem is an in­cred­ibly im­por­tant re­sult on car­di­nal util­ity func­tions. Ba­si­cally, it shows that any agent satis­fy­ing a few ba­sic ax­ioms of ‘ra­tio­nal­ity’ has a car­di­nal util­ity func­tion. (How­ever, we know that hu­mans don’t satisfy these ax­ioms. To model hu­man be­hav­ior, one should in­stead use the de­scrip­tive prospect the­ory.)

• Be­ware of er­ro­neous straw char­ac­ter­i­za­tions of util­ity func­tions (re­cent ex­am­ple). Re­mem­ber the VNM the­o­rem—very fru­gal as­sump­tions are suffi­cient to show the ex­is­tence of a car­di­nal util­ity func­tion. In a sense, this means that util­ity func­tions can model any set of prefer­ences that are not log­i­cally con­tra­dic­tory.

• Or­di­nal util­ity func­tions are equiv­a­lent up to strictly in­creas­ing trans­for­ma­tions, whereas car­di­nal util­ity func­tions are equiv­a­lent only up to pos­i­tive af­fine trans­for­ma­tions.

• Utility func­tions are of­ten called pay­off func­tions in game the­ory.
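The note on transformations can be checked numerically. In the sketch below (utility numbers and function names are illustrative assumptions, not from the thread), a positive affine transformation a·u + b with a > 0 keeps the ranking of two mixes, while a strictly increasing but non-affine transformation (cubing) keeps the ranking of sure outcomes yet flips the ranking of the mixes.

```python
# Hypothetical utilities; the numbers are illustrative only.
u = {"x1": 10.0, "x2": 2.0, "x3": 5.0, "x4": 1.0}
mix_a = {"x1": 0.25, "x2": 0.75}
mix_b = {"x3": 0.8, "x4": 0.2}


def expected_utility(mix, utility):
    """Expected utility of a mix given as {choice: probability}."""
    return sum(p * utility[x] for x, p in mix.items())


def transform(utility, f):
    """Apply f to every utility value, giving a new utility function."""
    return {x: f(v) for x, v in utility.items()}


def ranking(utility):
    """Sure outcomes ordered from least to most preferred."""
    return sorted(utility, key=utility.get)


def prefers_b(utility):
    """True if mix_b has higher expected utility than mix_a."""
    return expected_utility(mix_b, utility) > expected_utility(mix_a, utility)


affine = transform(u, lambda v: 3.0 * v + 7.0)  # positive affine: a=3, b=7
cubed = transform(u, lambda v: v ** 3)          # strictly increasing, not affine

print(ranking(u) == ranking(affine) == ranking(cubed))
print(prefers_b(u), prefers_b(affine), prefers_b(cubed))
```

Both transformed functions agree with `u` as ordinal utility functions, but only the affine one agrees with it as a cardinal utility function.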

• Wik­tionary seems to have a de­cent defi­ni­tion. It boils down to list­ing all pos­si­ble out­comes and or­der­ing them ac­cord­ing to your prefer­ences. The words “af­fine trans­for­ma­tion” re­flect the fact that all pos­si­ble ways to as­sign num­bers to out­comes which re­sult in the same or­der­ing are equiv­a­lent.

• So, in the spirit of stupid (but nag­ging) ques­tions:

The se­quences pre­sent a con­vinc­ing case (to me at least) that MWI is the right view of things, and that it is the best con­clu­sion of our un­der­stand­ing of physics. Yet I don’t be­lieve it, be­cause it seems to be in di­rect con­flict with the fact of ethics: if all I can do is push the bad­ness out of my path, and into some other path, then I can’t see how do­ing good things mat­ters. I can’t change the fun­da­men­tal amount of good­ness, I can just push it around. Yet it mat­ters that I’m good and not bad.

The ‘keep your own path clean’ an­swer is very un­satis­fy­ing, just be­cause it doesn’t work any­where else. I can’t just keep my own fam­ily, neigh­bor­hood, city, coun­try, or planet clean. I can’t even just de­cide to keep my own tem­po­ral chunk of the uni­verse clean, while ig­nor­ing the rest and even at the ex­pense of the rest of it. Why should this prin­ci­ple sud­denly work in the case of other wor­lds? It seems ad hoc.

So my stupid ques­tion is this: why aren’t MWI and ethics just flatly in con­flict?

• be­cause it seems to be in di­rect con­flict with the fact of ethics

Ac­tual an­swers aside, as a ra­tio­nal­ist, this phrase should cause you to panic.

• What do you mean by in con­flict? Believ­ing one says noth­ing about the other. You’re not “push­ing” any­thing around. If you act good in one set of uni­verses, that is a set of uni­verses made bet­ter by your ac­tions. If you act bad in an­other, the same thing. Act­ing good does not cause other uni­verses to be­come bad.

• People making decisions are not quantum events. When a photon could either end up in a detector or not, there are branches where it does and branches where it doesn’t. But when you decide whether or not to do something good, this decision is being carried out by neurons, which are big enough that quantum events do not influence them much. This means that if you decide to do something good, you probably also decided to do the same good thing in the overwhelming majority of Everett branches that diverge from when you started considering the decision.

• This may be true, but I don’t think any­one knows for sure, and it seems likely to me that the brain has the prop­erty of sen­si­tivity to ini­tial con­di­tions, mean­ing that it’s likely to do differ­ent stuff in differ­ent Everett branches.

So this sug­gests that [....] over timescales like that of hu­man his­tory we will see an over­whelm­ingly large num­ber of uni­verses that are com­pletely iden­ti­cal on the hu­man level—ones where elec­trons ended up in slightly differ­ent po­si­tions but no harm done [...]

More on-topic for the grand­par­ent: Greg Egan’s nov­ella Or­a­cle talks about the eth­i­cal is­sue of bad stuff hap­pen­ing in other Everett branches.

• The fact that I can re­li­ably mul­ti­ply num­bers shows that at least some of my de­ci­sions are de­ter­minis­tic.

To the ex­tent that I make eth­i­cal de­ci­sions based on some par­tially de­ter­minis­tic rea­son­ing pro­cess, my eth­i­cal de­ci­sions are not chaotic.

If, due to chaos, I have a prob­a­bil­ity p of slap­ping my friends in­stead of hug­ging them, then Laplace’s law of suc­ces­sion tells me that p is less than 1%.
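For reference, Laplace’s rule of succession estimates the probability of an event that has occurred s times in n trials as (s + 1) / (n + 2). The comment does not say how many greetings are being counted, so the sketch below treats n as an assumed input and asks how many slap-free greetings are needed before the estimate drops below 1%. The function names are hypothetical.

```python
from fractions import Fraction


def rule_of_succession(successes, trials):
    """Laplace's estimate (s + 1) / (n + 2) for the next-trial probability."""
    return Fraction(successes + 1, trials + 2)


def min_trials_for_bound(bound):
    """Smallest n such that zero slaps in n greetings gives an estimate below bound."""
    n = 0
    while rule_of_succession(0, n) >= bound:
        n += 1
    return n


needed = min_trials_for_bound(Fraction(1, 100))
print(needed, rule_of_succession(0, needed))
```

So the “less than 1%” figure holds once there are roughly a hundred observed greetings with no slaps among them.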

• There must be chaotic am­plifi­ca­tion of quan­tum events go­ing on. Any macro­scopic sys­tem at finite tem­per­a­ture will be full of quan­tum events, like a molecule in an ex­cited state re­turn­ing to its ground state. The quan­tum ran­dom­ness is a con­stant source of “noise” which nor­mally av­er­ages out, but some­times there will be fluc­tu­a­tions away from a mean, and some­times they will be am­plified into meso­scopic and macro­scopic differ­ence. This must be true, but it would be best to have a math­e­mat­i­cal demon­stra­tion, e.g. that the im­pact of quan­tum fluc­tu­a­tions on the trans­fer of heat through an at­mo­sphere will am­plify into macro­scop­i­cally differ­ent weather pat­terns on a cer­tain timescale.

• I have taken lots of de­ci­sions based on ran­dom bits from Four­milab or ran­dom.org (es­pe­cially be­fore find­ing LessWrong—nowa­days I only do that when de­cid­ing which pass­word to use and stuff like that).

• The se­quences pre­sent a con­vinc­ing case (to me at least) that MWI is the right view of things, and that it is the best con­clu­sion of our un­der­stand­ing of physics.

Just a cau­tion, here. The se­quences only re­ally talk about non-rel­a­tivis­tic quan­tum me­chan­ics (NRQM), and I agree that MWI is the best in­ter­pre­ta­tion of this the­ory. How­ever, NRQM is false, so it doesn’t fol­low that MWI is the “right view of things” in the gen­eral sense. Quan­tum field the­ory (QFT) is closer to the truth, but there are a num­ber of bar­ri­ers to a straight­for­ward im­por­ta­tion of MWI into the lan­guage of QFT. I’m rea­son­ably con­fi­dent that an MWI-like in­ter­pre­ta­tion of QFT can be con­structed, but it does not ex­ist in any rigor­ous form as of yet (as far as I am aware, at least). You should be aware of this be­fore com­mit­ting your­self to the claim that MWI is an ac­cu­rate de­scrip­tion of the world, rather than just the best way of con­cep­tu­al­iz­ing the world as de­scribed by NRQM.

• Quan­tum field the­ory (QFT) is closer to the truth, but there are a num­ber of bar­ri­ers to a straight­for­ward im­por­ta­tion of MWI into the lan­guage of QFT

This is im­por­tant if true, and I would like to know more. What are the bar­ri­ers?

I’m rea­son­ably con­fi­dent that an MWI-like in­ter­pre­ta­tion of QFT can be con­structed, but it does not ex­ist in any rigor­ous form as of yet

On the other hand, my un­der­stand­ing is that QFT it­self doesn’t ex­ist in a rigor­ous form yet, ei­ther.

• This ar­ti­cle (PDF) gives a nice (and fairly ac­cessible) sum­mary of some of the is­sues in­volved in ex­tend­ing MWI to QFT. See sec­tions 4 and 8 in par­tic­u­lar. Their fo­cus in the pa­per is wave­func­tion re­al­ism, but given that MWI (at least the ver­sion ad­vo­cated in the Se­quences) is com­mit­ted to wave­func­tion re­al­ism, their ar­gu­ments ap­ply. They offer a sug­ges­tion of the kind of the­ory that they think can re­place MWI in the rel­a­tivis­tic con­text, but the view is in­suffi­ciently de­vel­oped (at least in that pa­per) for me to fully eval­u­ate it.

A quick sum­mary of the is­sues raised in the pa­per:

• In NRQM, the wave func­tion lives in con­figu­ra­tion space, but there is no well-defined par­ti­cle con­figu­ra­tion space in QFT since par­ti­cle num­ber is not con­served and par­ti­cles are emer­gent en­tities with­out pre­cisely defined phys­i­cal prop­er­ties.

• A move to field con­figu­ra­tion space is un­satis­fac­tory be­cause quan­tum field the­o­ries ad­mit of equiv­a­lent de­scrip­tion us­ing many differ­ent choices of field ob­serv­able. Un­like NRQM, where there are solid dy­nam­i­cal rea­sons for choos­ing the po­si­tion ba­sis as fun­da­men­tal, there seems to be no nat­u­ral or dy­nam­i­cally preferred choice in QFT, so a choice of a par­tic­u­lar field con­figu­ra­tion space de­scrip­tion would amount to ad hoc priv­ileg­ing.

• MWI in NRQM treats phys­i­cal space as non-fun­da­men­tal. This is hard to jus­tify in QFT, be­cause phys­i­cal space-time is bound up with the fun­da­men­tals of the the­ory to a much greater de­gree. The dy­nam­i­cal vari­ables in QFT are op­er­a­tors that are ex­plic­itly as­so­ci­ated with space-time re­gions.

• This ob­jec­tion is par­tic­u­larly clever and in­ter­est­ing, I think. In MWI, the his­tory of the uni­verse is fully speci­fied by giv­ing the uni­ver­sal wave­func­tion at each time in some refer­ence frame. In a rel­a­tivis­tic con­text, one would ex­pect that all one needs to do in or­der to de­scribe how the uni­verse looks in some other in­er­tial refer­ence frame is to perform a Lorentz trans­for­ma­tion on this his­tory. If the his­tory re­ally tells us ev­ery­thing about the phys­i­cal state of the uni­verse, then it gives us all the in­for­ma­tion re­quired to de­ter­mine how the uni­verse looks un­der a Lorentz trans­for­ma­tion. But in rel­a­tivis­tic quan­tum me­chan­ics, this is not true. Fully spec­i­fy­ing the wave­func­tion (defined on an ar­bi­trar­ily cho­sen field con­figu­ra­tion space, say) at all times is not suffi­cient to de­ter­mine what the uni­verse will look like un­der a Lorentz trans­for­ma­tion. See the ex­am­ple on p. 21 in the pa­per, or read David Albert’s pa­per on nar­rata­bil­ity. This sug­gests that giv­ing the wave­func­tion at all times is not a full speci­fi­ca­tion of the phys­i­cal prop­er­ties of the uni­verse.

On the other hand, my un­der­stand­ing is that QFT it­self doesn’t ex­ist in a rigor­ous form yet, ei­ther.

I as­sume you’re refer­ring to the in­fini­ties that arise in QFT when we in­te­grate over ar­bi­trar­ily short length scales. I don’t think this shows a lack of rigor in QFT. Thanks to the de­vel­op­ment of renor­mal­iza­tion group the­ory in the 70s, we know how to do func­tional in­te­grals in QFT with an im­posed cut­off at some finite short length scale. QFT with a cut­off doesn’t suffer from prob­lems in­volv­ing in­fini­ties. Of course, the ne­ces­sity of the cut­off is an in­di­ca­tion that QFT is not a com­pletely ac­cu­rate de­scrip­tion of the uni­verse. But we already know that we’re go­ing to need a the­ory of quan­tum grav­ity at the Planck scale. In the do­main where it works, QFT is rea­son­ably rigor­ously defined, I’d say.

• This ar­ti­cle (PDF) gives a nice (and fairly ac­cessible) sum­mary of some of the is­sues in­volved in ex­tend­ing MWI to QFT.

Thanks for that; it’s quite an in­ter­est­ing ar­ti­cle, and I’m still try­ing to ab­sorb it. How­ever, one thing that seems pretty clear to me is that for EY’s in­tended philo­soph­i­cal pur­poses, there re­ally is no im­por­tant dis­tinc­tion be­tween “wave­func­tion re­al­ism” (in the con­text of NRQM) and “space­time state re­al­ism” (in the con­text of QFT). Espe­cially since I con­sider this post to be mostly wrong: lo­cal­ity in con­figu­ra­tion space is what mat­ters, and con­figu­ra­tion space is a vec­tor space (speci­fi­cally a Hilbert space) -- there is no preferred (or­thonor­mal) ba­sis.

I as­sume you’re refer­ring to the in­fini­ties that arise in QFT when we in­te­grate over ar­bi­trar­ily short length scales. I don’t think this shows a lack of rigor in QFT

If the “problem” is merely that certain integrals are divergent, then I agree. No one says that the fact that $\int_{0}^{1} \frac{1}{x} \, dx$ diverges shows a lack of rigor in real analysis!
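To spell the example out (a standard calculus fact; the lower cutoff $\epsilon$ is my own notation):

$$\int_{\epsilon}^{1} \frac{1}{x} \, dx = \ln 1 - \ln \epsilon = -\ln \epsilon \;\to\; \infty \quad \text{as } \epsilon \to 0^{+}.$$

For every fixed $\epsilon > 0$ the integral is finite and well defined, and “the limit does not exist” is itself a rigorous statement. This is only loosely analogous to imposing a short-distance cutoff in QFT.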

What con­cerns me is whether any ac­tual math­e­mat­i­cal lies are be­ing told—such as in­te­grals be­ing as­sumed to con­verge when they haven’t yet been proved to do so. Or some­thing like the early his­tory of the Dirac delta, when physi­cists unashamedly spoke of a “func­tion” with prop­er­ties that a func­tion can­not, in fact, have.

If QFT is merely a phys­i­cal lie—i.e., “not a com­pletely ac­cu­rate de­scrip­tion of the uni­verse”—and not a math­e­mat­i­cal one, then that’s a differ­ent mat­ter, and I wouldn’t call it an is­sue of “rigor”.

• How­ever, one thing that seems pretty clear to me is that for EY’s in­tended philo­soph­i­cal pur­poses, there re­ally is no im­por­tant dis­tinc­tion be­tween “wave­func­tion re­al­ism” (in the con­text of NRQM) and “space­time state re­al­ism” (in the con­text of QFT).

I’m a lit­tle un­clear about what EY’s in­tended philo­soph­i­cal pur­poses are in this con­text, so this might well be true. One pos­si­ble prob­lem worth point­ing out is that space­time state re­al­ism in­volves an aban­don­ment of a par­tic­u­lar form of re­duc­tion­ism. Whether or not EY is com­mit­ted to this form of re­duc­tion­ism some­body more fa­mil­iar with the se­quences than I would have to judge.

Ac­cord­ing to space­time state re­al­ism, the phys­i­cal state of a space­time re­gion is not su­per­ve­nient on the phys­i­cal states of its sub­re­gions, i.e. the phys­i­cal state of a space­time re­gion could be differ­ent with­out any of its sub­re­gions be­ing in differ­ent states. This is be­cause sub­re­gions can be en­tan­gled with one an­other in differ­ent ways with­out al­ter­ing their lo­cal states. This is not true of wave­func­tion re­al­ism set in con­figu­ra­tion space. There, the only way a re­gion of con­figu­ra­tion space could have differ­ent phys­i­cal prop­er­ties is if some of its sub­re­gions had differ­ent prop­er­ties.

Also, I think it’s pos­si­ble that the fact that the differ­ent “wor­lds” in space­time state re­al­ism are spa­tially over­lap­ping (as op­posed to wave­func­tion re­al­ism, where they are sep­a­rated in con­figu­ra­tion space) might lead to in­ter­est­ing con­cep­tual differ­ences be­tween the two in­ter­pre­ta­tions. I haven’t thought about this enough to give spe­cific rea­sons for this sus­pi­cion, though.

Espe­cially since I con­sider this post to be mostly wrong: lo­cal­ity in con­figu­ra­tion space is what mat­ters, and con­figu­ra­tion space is a vec­tor space (speci­fi­cally a Hilbert space) -- there is no preferred (or­thonor­mal) ba­sis.

I’m not sure ex­actly what you’re say­ing here, but if you’re re­ject­ing the claim that MWI priv­ileges a par­tic­u­lar ba­sis, I think you’re wrong. Of course, you could treat con­figu­ra­tion space it­self as if it had no preferred ba­sis, but this would still amount to priv­ileg­ing po­si­tion over mo­men­tum. You can’t go from po­si­tion space to mo­men­tum space by a change of co­or­di­nates in con­figu­ra­tion space. Con­figu­ra­tion space is always a space of pos­si­ble par­ti­cle po­si­tion con­figu­ra­tions, no mat­ter how you trans­form the co­or­di­nates.

I think you might be con­flat­ing con­figu­ra­tion space with the Hilbert space of wave­func­tions on con­figu­ra­tion space. In this lat­ter space, you can trans­form from a ba­sis of po­si­tion eigen­states to a ba­sis of mo­men­tum eigen­states with a co­or­di­nate trans­for­ma­tion. But this is not con­figu­ra­tion space it­self, it is the space of square in­te­grable func­tions on con­figu­ra­tion space. [I’m ly­ing a lit­tle for sim­plic­ity: Po­si­tion and mo­men­tum eigen­states aren’t ac­tu­ally square in­te­grable func­tions on con­figu­ra­tion space, but there are var­i­ous math­e­mat­i­cal tricks to get around this com­pli­ca­tion.]
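For a single particle in one dimension, the change of basis I have in mind is just the Fourier transform (standard textbook form; the symbols $\psi$, $\tilde{\psi}$ and $\hbar$ are the usual conventions, not anything specific to this thread):

$$\tilde{\psi}(p) = \frac{1}{\sqrt{2\pi\hbar}} \int_{-\infty}^{\infty} \psi(x) \, e^{-ipx/\hbar} \, dx$$

Here $\psi(x)$ and $\tilde{\psi}(p)$ are two representations of the same vector in Hilbert space. The coordinate $x$ on configuration space is integrated over, not re-coordinatized, which is why this is not a change of coordinates on configuration space itself.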

What con­cerns me is whether any ac­tual math­e­mat­i­cal lies are be­ing told—such as in­te­grals be­ing as­sumed to con­verge when they haven’t yet been proved to do so. Or some­thing like the early his­tory of the Dirac delta, when physi­cists unashamedly spoke of a “func­tion” with prop­er­ties that a func­tion can­not, in fact, have.

If this is your stan­dard for lack of rigor, then per­haps QFT hasn’t been rigor­ously for­mu­lated yet, but the same would hold of pretty much any phys­i­cal the­ory. I think you can find places in pretty much ev­ery the­ory where some such “math­e­mat­i­cal lie” is re­lied upon. There’s an ex­am­ple of a stan­dard math­e­mat­i­cal lie told in NRQM ear­lier in my post.

In many of these cases, math­e­mat­i­ci­ans have for­mu­lated more rigor­ous ver­sions of the rele­vant proofs, but I think most physi­cists tend to be blithely ig­no­rant of these math­e­mat­i­cal re­sults. Maybe QFT isn’t rigor­ously for­mu­lated ac­cord­ing to the math­e­mat­i­cian’s stan­dards of rigor, but it meets the physi­cist’s lower stan­dards of rigor. There’s a rea­son most physi­cists work­ing on QFT are un­in­ter­ested in things like Alge­braic Quan­tum Field The­ory.

• I’m a lit­tle un­clear about what EY’s in­tended philo­soph­i­cal pur­poses are in this context

As I read him, he mainly wants to make the point that “sim­plic­ity” is not the same as “in­tu­itive­ness”, and the former trumps the lat­ter. It may seem more “hu­manly nat­u­ral” for there to be some mag­i­cal pro­cess caus­ing wave­func­tion col­lapse than for there to be a pro­lifer­a­tion of “wor­lds”, but be­cause the lat­ter doesn’t re­quire any ad­di­tions to the equa­tions, it is strictly sim­pler and thus fa­vored by Oc­cam’s Ra­zor.

I think you might be con­flat­ing con­figu­ra­tion space with the Hilbert space of wave­func­tions on con­figu­ra­tion space.

Yes, sorry. What I ac­tu­ally meant by “con­figu­ra­tion space” was “the Hilbert space that wave­func­tions are el­e­ments of”. That space, what­ever you call it (“state space”?), is the one that mat­ters in the con­text of “wave­func­tion re­al­ism”.

(This ex­plains an oth­er­wise puz­zling pas­sage in the ar­ti­cle you linked, which con­trasts the “con­figu­ra­tion space” and “Hilbert space” for­mal­isms; but on the other hand, it re­duces my cre­dence that EY knows what he’s talk­ing about in the QM se­quence, since he doesn’t seem to talk about the space-that-wave­func­tions-are-el­e­ments-of much at all.)

If this is your stan­dard for lack of rigor, then per­haps QFT hasn’t been rigor­ously for­mu­lated yet, but the same would hold of pretty much any phys­i­cal theory

This is con­trary to my un­der­stand­ing. I was un­der the im­pres­sion that clas­si­cal me­chan­ics, gen­eral rel­a­tivity, and NRQM had all by now been given rigor­ous math­e­mat­i­cal for­mu­la­tions (in terms of sym­plec­tic ge­om­e­try, Lorentzian ge­om­e­try, and the the­ory of op­er­a­tors on Hilbert space re­spec­tively).

Maybe QFT isn’t rigor­ously for­mu­lated ac­cord­ing to the math­e­mat­i­cian’s stan­dards of rigor, but it meets the physi­cist’s lower stan­dards of rigor. There’s a rea­son most physi­cists work­ing on QFT are un­in­ter­ested in things like Alge­braic Quan­tum Field The­ory.

The math­e­mat­i­cian’s stan­dards are what in­ter­ests me, and are what I mean by “rigor”. I don’t con­sider it a virtue on the part of physi­cists that they are un­aware of or un­in­ter­ested in the math­e­mat­i­cal foun­da­tions of physics, even if they are able to get away with be­ing so un­in­ter­ested. There is a rea­son math­e­mat­i­ci­ans have the stan­dards of rigor they do. (And it should of course be said that some physi­cists are in­ter­ested in rigor­ous math­e­mat­ics.)

• This is a very good post, but I wonder: One of the authors of the paper you cite is David Wallace, perhaps the most prominent proponent of the modern Everettian interpretation. He just published a new book called “The Emergent Multiverse” and he claims there is no problem unifying MWI with QFT because interactions within worlds are local and only states are nonlocal. I have yet to hear him mention any need for serious reformulation of anything in terms of MWI.

You said you suspect this is necessary, but that you hope we can recover a similar MWI, but isn’t it more reasonable to expect that at the Planck scale something else will explain the quantum weirdness? After all, if MWI fails both probability and relativity, then there is no good reason to suspect that this interpretation is correct.

Have you given Gerard ’t Hooft’s idea of cellular automata, which he claims salvages determinism, locality and realism, any thought?

• You said you suspect this is necessary, but that you hope we can recover a similar MWI, but isn’t it more reasonable to expect that at the Planck scale something else will explain the quantum weirdness?

When I talk about re­cov­er­ing MWI, I re­ally just mean ab­sorb­ing the les­son that our the­ory does not need to de­liver de­ter­mi­nate mea­sure­ment re­sults, and ad hoc tools for satis­fy­ing this con­straint (such as col­lapse or hid­den vari­ables) are otiose. Of course, the foun­da­tions of our even­tual the­ory of quan­tum grav­ity might be differ­ent enough from those of quan­tum the­ory that the in­ter­pre­ta­tional op­tions don’t trans­late. How differ­ent the foun­da­tions will be de­pends on which pro­gram ends up work­ing out, I sus­pect. If some­thing like canon­i­cal quan­tum grav­ity or loop quan­tum grav­ity turns out to be the way to go, then I think a lot of the con­cep­tual work done in in­ter­pret­ing NRQM and QFT will carry over. If string the­ory turns out to be on the right track, then maybe a more rad­i­cal in­ter­pre­ta­tional re­vi­sion will be re­quired. The foun­da­tions of string the­ory are now thought to lie in M-the­ory, and the na­ture of this the­ory is still pretty con­cep­tu­ally opaque. It’s worth not­ing though that Bousso and Susskind have ac­tu­ally sug­gested that string the­ory pro­vides a solid foun­da­tion for MWI, and that the wor­lds in the string the­ory land­scape are the same thing as the wor­lds in MWI. See here for more on this. The pa­per has been on my “to read” list for a while, but I haven’t got­ten around to it yet. I’m skep­ti­cal but in­ter­ested.

Have you given Gerard ’t Hooft’s idea of cellular automata, which he claims salvages determinism, locality and realism, any thought?

I know of ‘t Hooft’s cel­lu­lar au­tomata stuff, but I don’t know much about it. Speak­ing from a po­si­tion of ad­mit­ted ig­no­rance, I’m skep­ti­cal. I sus­pect the only way to con­struct a gen­uinely de­ter­minis­tic lo­cal re­al­ist the­ory that re­pro­duces quan­tum statis­tics is to em­brace su­perde­ter­minism in some form, i.e. to place con­straints on the bound­ary con­di­tions of the uni­verse that make the statis­tics work out by hand. This move doesn’t seem like good physics prac­tice to me. Do you know if ’t Hooft’s strat­egy re­lies on some similar move?

• ’t Hooft’s lat­est pa­per is the first in which he maps a full QFT to a CA, and the QFT in ques­tion is a free field the­ory. So I think that in this case he evades Bell’s the­o­rem, quan­tum com­plex­ity the­o­rems, etc, by work­ing in a the­ory where phys­i­cal de­tec­tors, quan­tum com­put­ers, etc don’t ex­ist, be­cause in­ter­ac­tions don’t ex­ist. It’s like how you can evade the in­com­plete­ness the­o­rems if your ar­ith­metic only has ad­di­tion but not mul­ti­pli­ca­tion. Else­where he does ap­peal to su­per­s­e­lec­tion /​ cos­molog­i­cal ini­tial con­di­tions as a way to avoid cat states (macro­scopic su­per­po­si­tions), but I don’t see that play­ing a role here.

The map­ping it­self has some­thing to do with fo­cus­ing on the frac­tional part of par­ti­cle mo­men­tum as finite, and avoid­ing di­ver­gences by fo­cus­ing on a par­tic­u­lar sub­space. It’s not a triv­ial re­sult. But ex­tend­ing it to in­ter­act­ing field the­ory will re­quire new ideas, e.g. mak­ing the state space of each in­di­vi­d­ual cell in the CA into a Fock space, or per­mit­ting CTCs in the CA grid. Surely you need rad­i­cal in­gre­di­ents like that in or­der to re­cover the full quan­tum state space…

• Aha, I see. So you do not share EY’s view that MWI is “correct”, then, and that the only problem it faces is recovering the Born Rule? I agree that obviously what will end up working will depend on what the foundations are :) I remember that paper by Bousso and Susskind; I even remember sending a mail to Susskind about it, while at the same time asking him for his opinion of ’t Hooft’s work. If I remember correctly, the paper was discussed at some length over at physicsforums.com (can’t remember the post) and the consensus seemed to be that the authors had misinterpreted decoherence in some way. I don’t remember the details, but the fact that the paper itself has not been mentioned or cited in any article I have read since then indicates to me that there must have been some serious error in it. Also, Susskind’s answer regarding ’t Hooft’s work was illuminating. To paraphrase, he said he felt that ’t Hooft might be correct, but since there were no predictions it was hard to hold a strong opinion either way on the matter. So it seems Susskind was not very sold on his own idea.

Gerard ’t Hooft actually does rely on what people call “superdeterminism”, which I just call “full determinism”, which I think is also a term ’t Hooft likes more. At least that is what his papers indicate. He discusses this somewhat in an article from 2008 in response to Simon Kochen and John Conway’s Free Will Theorem. You might want to read the article: http://www.sciencenews.org/view/generic/id/35391/title/Math_Trek__Do_subatomic_particles_have_free_will%3F After that you might want to head on over to arXiv; ’t Hooft has published three papers in the last six months on this issue and he seems more and more certain of it. He also addresses the objections in some notes in those papers. Link: http://arxiv.org/find/quant-ph/1/au:+Hooft_G/0/1/0/all/0/1

• There is the standard MWI advocacy that matches Eliezer’s views. This is a critique of this advocacy, point by point. See especially Q14, re QFT. This gives a reason why MWI is not a useful object of study.

• This is a cri­tique of this ad­vo­cacy, point by point. See es­pe­cially Q14, re QFT. This gives a rea­son why MWI is not a use­ful ob­ject of study.

The first critique seems to criticize something different from what Eliezer says. It seems like the person quoted by the author did not express themselves clearly, and the critique attacks that wrong explanation. For example, this part:

When do wor­lds split?

The pre­cise mo­ment/​lo­ca­tion of the split is not sharply defined due to the sub­jec­tive na­ture of ir­re­versibil­ity, but can be con­sid­ered com­plete when much more than kT of en­ergy has been re­leased in an un­con­trol­led fash­ion into the en­vi­ron­ment. At this stage the event has be­come ir­re­versible.

How can ir­re­versibil­ity be sub­jec­tive if it defines what a mea­sure­ment is and when wor­lds split? It would im­ply that when wor­lds split is also a sub­jec­tive mat­ter. But then it is ob­server-de­pen­dent, the very thing the in­ter­pre­ta­tion is try­ing to avoid.

For me, Eliezer’s explanation of “blobs of amplitude” makes sense. There is a set of possible configurations, which at the beginning are all very similar, but because some interactions make the differences grow, the set gradually separates into smaller subsets. When exactly? Well, in theory the parts are connected forever, but the connection only has epsilon size relative to the subsets, so it can be ignored. But asking when exactly is like asking “what exactly is the largest number that can be considered ‘almost zero’?”. If you want to be exact, only zero is exactly zero. On the other hand, 1/3^^^3 is for all practical purposes zero. I would feel uncomfortable picking one number and saying “ok, this X is ‘almost zero’, but 1.000001 X is not ‘almost zero’”.

The quoted person seems to say something similar, just less clearly, which allows the critic to use the word “subjective” and jump to the wrong conclusion that the author is saying that mathematics is observer-dependent. (Analogously, just because you and I can have different interpretations of ‘almost zero’, that does not mean mathematics is subjective and observer-dependent. It just means that ‘almost zero’ is not exactly defined, but in real life we care whether e.g. the water we drink contains ‘almost zero’ poison.)

So generally, for me it means that once someone famous gives a wrong (or just ambiguous) explanation of MWI, that explanation will forever be used as an argument against anything similar to MWI.

• This gives a rea­son why MWI is not a use­ful ob­ject of study.

Well, not quite. Some­one ought to be think­ing about this sort of stuff, and the claim that link makes is that MWI isn’t worth con­sid­er­ing be­cause it goes against the “sci­en­tific ethos.”

The rea­son I would tell peo­ple why MWI is not a use­ful ob­ject of study (for them) is be­cause un­til you make it a dis­agree­ment about the ter­ri­tory, dis­agree­ing about maps cashes out as squab­bling. How you in­ter­pret QM should not mat­ter, so don’t waste time on it.

• be­cause un­til you make it a dis­agree­ment about the ter­ri­tory, dis­agree­ing about maps cashes out as squab­bling.

Tell that to EY.

• On the other hand, my un­der­stand­ing is that QFT it­self doesn’t ex­ist in a rigor­ous form yet, ei­ther.

Depends on what you mean by rigor­ous. (OTOH, it’s not fully com­pat­i­ble with gen­eral rel­a­tivity, so we know it doesn’t ex­actly de­scribe the world—or that GR doesn’t, or that nei­ther does.)

• If you bug physicists enough, they will admit that the Standard Model has some problems, like the Landau pole. However, there are toy QFTs in 2 spatial dimensions that have models rigorous enough for mathematicians. That should be adequate for philosophical purposes.

• I don’t think the Landau pole can be characterized as an actual problem. It was considered a problem for strong interactions, but we now know that quantum chromodynamics is asymptotically free, so it does not have a Landau pole. The Landau pole for quantum electrodynamics is at an energy scale much, much higher than the Planck energy. We already know that we need new physics at the Planck scale, so the lack of asymptotic freedom in the Standard Model is not a real practical (or even conceptual) problem.

• The Lan­dau pole for QED goes away when cou­pled with QCD, but I be­lieve an­other one ap­pears with the Higgs field.

If you don’t like the ques­tion I’m an­swer­ing, com­plain to Kom­pon­isto, not me.
But what would you count as a con­cep­tual prob­lem?

• If you don’t like the ques­tion I’m an­swer­ing, com­plain to Kom­pon­isto, not me.

I wasn’t com­plain­ing to any­one. And I don’t dis­like the ques­tion. I was just adding some rele­vant in­for­ma­tion. Any­way, I did re­ply di­rectly to kom­pon­isto as well. See the end of my long com­ment above.

But what would you count as a con­cep­tual prob­lem?

If we did not have in­de­pen­dent ev­i­dence that QFT breaks down at the Planck scale (since grav­ity is not renor­mal­iz­able), I might have con­sid­ered the Lan­dau pole a con­cep­tual prob­lem for QFT. But since it is only a prob­lem in a do­main where we already know QFT doesn’t work, I don’t see it that way.

• I don’t think that’s the nor­mal use of “con­cep­tual prob­lem.”

If physi­cists be­lieve, as their ver­biage seems to in­di­cate, that QED is a real the­ory that is an ap­prox­i­ma­tion to re­al­ity, and they com­pute ap­prox­i­ma­tions to the num­bers in QED, while QED is ac­tu­ally in­con­sis­tent, I would say that is an er­ror and a paradig­matic ex­am­ple of a con­cep­tual er­ror.

What does it mean to in­ter­pret an in­con­sis­tent the­ory?

• if all I can do is push the bad­ness out of my path, and into some other path

MWI doesn’t say any­thing like that. Noth­ing in physics says any­thing about “bad­ness” or “good­ness”.

Well, ex­cept in­so­far as hu­mans run on physics, and as such can be de­scribed by physics.

• I can’t change the fun­da­men­tal amount of good­ness, I can just push it around.

Wrong (even when as­sum­ing there is an ex­act defi­ni­tion of good­ness).

You can’t fix all branches of the uni­verse, be­cause (1) in most branches you don’t ex­ist, and (2) in a very few branches to­tally ran­dom events may pre­vent your ac­tions. But this does not mean that your ac­tions don’t in­crease the amount of good­ness.

First, you are responsible only for the branches where you existed, so let’s just remove the other branches from our moral equation. Second, the exceptionally random events happen only in an exceptionally small proportion of branches. So even if some kind of Maxwell’s demon can ruin your actions in 0.000 … … … 001 of branches, there are still 0.999 … … … 999 of branches where your actions worked normally. And improving such a majority of branches is a good thing.

In each world, peo­ple choose the course that seems best to them. Maybe they hap­pen on a differ­ent line of think­ing, and see new im­pli­ca­tions or miss oth­ers, and come to a differ­ent choice. But it’s not that one world chooses each choice. It’s not that one ver­sion of you chooses what seems best, and an­other ver­sion chooses what seems worst. In each world, ap­ples go on fal­ling and peo­ple go on do­ing what seems like a good idea.

In all the wor­lds, peo­ple’s choices de­ter­mine out­comes in the same way they would in just one sin­gle world. The choice you make here does not have some strange bal­anc­ing in­fluence on some world el­se­where. There is no causal com­mu­ni­ca­tion be­tween de­co­her­ent wor­lds. In each world, peo­ple’s choices con­trol the fu­ture of that world, not some other world. If you can imag­ine de­ci­sion­mak­ing in one world, you can imag­ine de­ci­sion-mak­ing in many wor­lds: just have the world con­stantly split­ting while oth­er­wise obey­ing all the same rules.

• Well, let’s say we posit some starting condition, say the condition of the universe on the day I turned 17. I am down one path from that initial condition, and a great many other worlds exist in which things went a little differently. I take it that it’s not (unfortunately) a physical or logical impossibility that in one or more of those branches, I have ten years down the line committed a murder.

Now, there are a finite num­ber of mur­der-paths, and a finite num­ber of non-mur­der-paths, and my path is iden­ti­cal to one of them. But it seems to me that whether or not I mur­der some­one, the to­tal num­ber of mur­der-paths and the to­tal num­ber of non-mur­der-paths is the same? Is this to­tally off base? I hope that it is.

Any­way, if that’s true, then by not mur­der­ing, all I’ve done is put my­self off of a mur­der-path. There’s one less mur­der in my world, but not one less mur­der ab­solutely. So, fine, live in my world and don’t worry about the oth­ers. But whence that rule? That seems ar­bi­trary, and I’m not al­lowed to ap­ply it in or­der to lo­cal­ize my eth­i­cal con­sid­er­a­tions in any other case.

• On a macro level, a Many Wor­lds model should be math­e­mat­i­cally equal to One World + Prob­a­bil­ities model. Be­ing un­happy that in 0.01% of Many Wor­lds you are a mur­derer, is like be­ing un­happy that with prob­a­bil­ity 0.01% you are a mur­derer in One World. The differ­ence is that in One World you can later say “I was lucky” or “I was un­lucky”, while in the Many Wor­lds model you can just say “this is a lucky branch” or “this is an un­lucky branch”.

But it seems to me that whether or not I mur­der some­one, the to­tal num­ber of mur­der-paths and the to­tal num­ber of non-mur­der-paths is the same?

At this point it seems to me that you are mixing a Many Worlds model with a naive determinism, and the problem is with the naive determinism. Imagine saying this: “on the day I turned 17, there is one fixed path towards the future, where I either commit a murder or don’t, and the result is the same whatever I do”. Is this right, or wrong, or confused, or...? Because this is what you are saying, just adding Many Worlds. The difference is that in the One World model, if you say “I will flip a coin, and based on the result I will kill him or not” and you mean it, then you are a murderer with probability 50%, while in Many Worlds you are a murderer in 50% of branches. (Of course, with the naive determinism the probability is also only in the mind: you were already determined to throw the coin with a given direction and speed.)

Simply speaking, in the Many Worlds model all probabilities happen, but higher probabilities happen “more” and lower probabilities happen “less”. You don’t want to be a murderer? Then behave so that your probability of murdering someone is as small as possible! This is equally valid advice for One World and Many Worlds.

So, fine, live in my world and don’t worry about the oth­ers. But whence that rule?

Because you can’t influence what happens in the other branches. However, if you did something that could lead with some probability to another person’s death (e.g. shooting at them and missing them), you should understand that it was a bad thing which made you (in some other branch) a murderer, so you should not do that again (but neither should you do that again in One World). On the other hand, if you did something that could lead to a good outcome, but you randomly failed, you did (in some other branch) a good thing. (Careful! You have a big bias to overestimate the probability of the good outcome. So don’t reward yourself too much for trying.)
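To make the bookkeeping in the comment above concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the function names `branch_weights` and `one_world_frequency` and the two example policies are hypothetical, and "branch weight" is treated simply as a number that behaves like a probability. It shows that the Many Worlds accounting (every outcome occurs, with a weight) and the One World accounting (one outcome, sampled with a probability) give the same figure, and that the figure depends on the policy you follow.

```python
import random
from fractions import Fraction

def branch_weights(p_murder):
    """Many Worlds bookkeeping: both outcomes occur, each with a weight."""
    p = Fraction(p_murder)
    return {"murder": p, "no_murder": 1 - p}

def one_world_frequency(p_murder, trials=100_000, seed=0):
    """One World bookkeeping: one outcome per trial, sampled with probability p."""
    rng = random.Random(seed)
    hits = sum(rng.random() < float(p_murder) for _ in range(trials))
    return hits / trials

# Two hypothetical decision policies; the chosen policy sets the weights.
coin_flip = branch_weights(Fraction(1, 2))     # "flip a coin and act on it"
careful = branch_weights(Fraction(1, 10_000))  # "behave so the chance is tiny"

print(coin_flip["murder"], careful["murder"])
print(one_world_frequency(Fraction(1, 2)))
```

On this toy picture, choosing the careful policy does not merely move you onto a better path: it shrinks the total weight of the bad branches, exactly as it shrinks the probability in the single-world description.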

• Be­ing un­happy that in 0.01% of Many Wor­lds you are a mur­derer, is like be­ing un­happy that with prob­a­bil­ity 0.01% you are a mur­derer in One World.

That doesn’t seem plausible. If there’s a 0.01% probability that I’m a murderer (and there is only one world) then if I’m not in fact a murderer, I have committed no murders. If there are many worlds, then I have committed no murders in this world, but the ‘me’ in another world (whose path approximates mine to the extent that I would call that person ‘me’) in fact is a murderer. It seems like a difference between some murders and no murders.

Be­cause this is what you are say­ing, just adding Many Wor­lds.

I’m say­ing that de­pend­ing on what I do, I end up in a non-mur­der path or a mur­der path. But noth­ing I do can change the num­ber of non-mur­der or mur­der paths. So it’s not de­ter­minis­tic as re­gards my po­si­tion in this se­lec­tion, just de­ter­minis­tic as re­gards the se­lec­tion it­self. I can’t causally in­ter­act with other wor­lds, so my not mur­der­ing in one world has no effect on any other wor­lds. If there are five mur­der wor­lds branch­ing off from my­self at 17, then there are five no mat­ter what. Maybe I can ad­just that num­ber prior to the day I turn 17, but there’s still a fixed num­ber of mur­der wor­lds ex­tend­ing from the day I was born, and there’s noth­ing I can do to change that. Is that a faulty case of de­ter­minism?

Because you can’t influence what happens in the other branches.

That’s a good point. Would you be will­ing to com­mit to an a pri­ori eth­i­cal prin­ci­ple such that ought im­plies can?

• If there are five mur­der wor­lds branch­ing off from my­self at 17, then there are five no mat­ter what.

That’s equiv­a­lent to say­ing “if at the mo­ment of my 17th birth­day there is a prob­a­bil­ity 5% that I will mur­der some­one, then in that mo­ment there is a prob­a­bil­ity 5% that I will mur­der some­one no mat­ter what”. I agree with this.

there’s still a fixed num­ber of mur­der wor­lds ex­tend­ing from the day I was born, and there’s noth­ing I can do to change that.

That’s equiv­a­lent to say­ing “if at the day I was born there is an X% chance that I will be­come a mur­derer, there is noth­ing I can do to change that prob­a­bil­ity on that day”. True; you can’t travel back in time and cre­ate a coun­ter­fac­tual uni­verse.

Is that a faulty case of de­ter­minism?

It is explained here, without the Many Worlds.

Short summary: You are mixing together two different views, the timeful view and the timeless view. In the timeful view you can say “today at 12:00 I decided to kill my neighbor”, and it makes sense. Then you switch to the position of a ceiling cat, an independent observer outside of our universe, outside of our time, and say “I cannot change the fact that today at 12:00 I killed my neighbor”. Yes, it also makes sense; if something happened, it cannot non-happen. But we are confusing two narrators here: the real you, and the ceiling cat. You decided to kill your neighbor. The ceiling cat cannot decide that you didn’t, because the ceiling cat does not live in this universe; it can only observe what you did. The reason you killed your neighbor is that you, existing in this universe, have decided to do so. You are the cause. The ceiling cat sees your action as determined, because it is outside of the universe.

If we apply it to the Many Worlds hypothesis, there are 100 different yous, and one ceiling cat. From those, 5 yous commit murder (because they decided to do so), and 95 don’t (because they decided otherwise, or just failed to murder successfully). Inside the universes, the 5 yous are murderers, the 95 are not. The ceiling cat may decide to blame those 95 for the actions of those 5, but that’s the ceiling cat’s decision. It should at least give you credit for keeping the ratio 5:95 instead of e.g. 50:50.

Would you be will­ing to com­mit to an a pri­ori eth­i­cal prin­ci­ple such that ought im­plies can?

That’s tricky. In some sense, we can’t do anything unless the atoms in our bodies do it; and our atoms are following the laws of physics. In some sense, there is no such thing as “can”, if we want to examine things on the atomic level. (And that’s equally true in Many Worlds as in One World; only in One World there is also a randomness in the equations.) In another sense, humans are decision-makers. But we are decision-makers built from atoms, not decision-makers about the atoms we are built from.

So my answer would be that “ought” implies psychological “can”; not atomic “can”. (Because the whole of ethics exists on the psychological level, not on the atomic level.)

• Short sum­mary: You are mix­ing to­gether two differ­ent views—time­ful and time­less view.

This sounds right to me, and I think your sub­se­quent anal­y­sis is on tar­get. So we have two views, the time­less view and the time­ful view and we can’t (at least di­rectly) trans­late eth­i­cal prin­ci­ples like ‘min­i­mize evils’ across the views. So say we grant this and move on from here. Maybe my ques­tion is just that the time­less view is one in which ethics seems to make no sense (or at least not the same kind of sense), and the time­ful view is a view in which it is a press­ing con­cern. Would you ob­ject to that?

• the time­less view is one in which ethics seems to make no sense

I didn’t fully re­al­ize that pre­vi­ously, but yes—in the time­less view there is no time, no change, no choice. Ethics is all about choices.

Eth­i­cal rea­son­ing only makes sense in time, be­cause the pro­cess of eth­i­cal rea­son­ing is mov­ing the par­ti­cles in your brain, and the phys­i­cal con­se­quence of that can be a good or evil ac­tion. Ethics can have an in­fluence on uni­verse only if it is a part of the uni­verse. The whole uni­verse is de­ter­mined only by its laws and its con­tents. The only way ethics can act is through the brains of peo­ple who con­tem­plate it. Ethics is a hu­man product (though we can dis­cuss how much free­dom did we have in cre­at­ing this product; whether it would be differ­ent if we had a differ­ent his­tory or biol­ogy) and it makes sense only on the hu­man level, not on the level of par­ti­cles.

• I just stick with the time­less view and don’t have any trou­ble with ethics in it, but that’s be­cause I’ve got all the phe­nom­ena of time fully em­bed­ded in the time­less view, in­clud­ing choice and moral­ity. :)

• Ethics is a hu­man product (though we can dis­cuss how much free­dom did we have in cre­at­ing this product; whether it would be differ­ent if we had a differ­ent his­tory or biol­ogy) and it makes sense only on the hu­man level, not on the level of par­ti­cles.

I’m happy with the idea that ethics is a human product (since this doesn’t imply that it’s arbitrary or illusory or anything like that). I take this to mean, basically, that ethics concerns the relation of some subsystems with others. There’s no ethical language which makes sense from the ‘top-down’ or from a global perspective. But there’s also nothing to prevent (this is Eliezer’s meaning, I guess) a non-global perspective from being worked out in which ethical language does make sense. And this perspective isn’t arbitrary, because the subsystems working it out have always occupied that perspective as subsystems. To see an algorithm from the inside is to see the world as a whole by seeing it as potentially involved in this algorithm. And this is what leads to the confusion between the global, timeless view and the (no less global, in some sense) timeful inside-an-algorithm view.

If that’s all passably normal (as skeptical as I am of the coherence of the idea of ‘adding up to normality’) then the question that remains is what I should do with my idea of things mattering ethically. Maybe the answer here is to see ethical agents as ontologically fundamental or something, though that sounds dangerously anthropocentric. But I don’t know how to justify the idea that physically-fundamental = ontologically-fundamental either.

• Would you be will­ing to com­mit to an a pri­ori eth­i­cal prin­ci­ple such that ought im­plies can?

I’m not Viliam Bur, but I wouldn’t quite agree with this, in that time mat­ters. It’s not in­co­her­ent to talk about a sys­tem that can’t do X, could have done X, and ought to have done X, for ex­am­ple. It’s similarly not in­co­her­ent to talk about a sys­tem that can’t do X now but ought to have acted in the past so as to be able to do X now.

But yes, in gen­eral I would say the pur­pose of ethics is to de­ter­mine right ac­tion. If we’re talk­ing about the eth­i­cal sta­tus of a sys­tem with re­spect to ac­tions we are vir­tu­ally cer­tain the sys­tem could not have taken, can not take, and will not be able to take, then we’re no longer talk­ing about ethics in any straight­for­ward sense.

• Okay, so let’s adopt ‘ought im­plies can’ then, and re­strict it to the same tense: if I ought to do X, I can do X. If I could have done (but can no longer do) X, then I ought to have done (but no longer ought to do) X.

How does this, in con­nec­tion with MW, in­ter­act with con­se­quen­tial­ism? The con­se­quences of my ac­tions can’t de­ter­mine how much mur­der­ing I do (in the big world sense), just whether or not I fall on a mur­der-path. In the big world sense, I can’t (and there­fore ought not) change the num­ber of mur­der-paths. The con­se­quence at which I should aim is the na­ture of the path I in­habit, be­cause that’s what I can change.

Maybe this is right, but if it is, it seems to me to be an oddly sub­jec­tive form of con­se­quen­tial­ism. I’m not sure if this cap­tures my thought, but it seems that it’s not as if I’m mak­ing the world a bet­ter place, I’m just putting my­self in a bet­ter world.

• it seems that it’s not as if I’m mak­ing the world a bet­ter place, I’m just putting my­self in a bet­ter world.

It seems like you are not making the world a better place because you think about a fixed probability of becoming a murderer, which your decisions cannot change. But the probability of you becoming a murderer is a result of your decisions.

You have re­versed the causal­ity, be­cause you imag­ine the prob­a­bil­ity of you ever be­ing a mur­derer as some­thing that ex­isted sooner, and your de­ci­sions about mur­der­ing as some­thing that hap­pens later.

You treat the probability of something happening in the future as a fact that happened in the past. (Which is a common error. When humans talk about “outside of time”, they always imagine it in the past. No, the past is not outside of time; it is a part of time.)

• The con­se­quences of my ac­tions can’t de­ter­mine how much mur­der­ing I do (in the big world sense), [...] the na­ture of the path I in­habit, be­cause that’s what I can change.

I’m not at all con­vinced that I en­dorse what you are do­ing with the word “I” here.

If we want to say that there ex­ists some en­tity I, such that I com­mit mur­ders on mul­ti­ple branches, then to also talk about “the na­ture of the path I in­habit” seems en­tirely in­co­her­ent. There is no sin­gle path I in­habit, I (as defined here) in­hab­its all paths.

Con­versely, if we want to say that there ex­ists a sin­gle path that I in­habit (a much more con­ven­tional way of speak­ing), then mur­ders com­mit­ted on other branches are not mur­ders I com­mit.

I’m not sure if that af­fects your point or not, but I have trou­ble re­fac­tor­ing your point to elimi­nate that con­fu­sion, so it seems rele­vant.

• If we want to say that there ex­ists some en­tity I, such that I com­mit mur­ders on mul­ti­ple branches, then to also talk about “the na­ture of the path I in­habit” seems en­tirely in­co­her­ent. There is no sin­gle path I in­habit, I (as defined here) in­hab­its all paths.

True, good point. That seems to be salt on the wound though. What I meant by ‘I’ is this: say I’m in path A. I have a par­allel ‘I’ in path B if the con­figu­ra­tion of some­thing in B is such that, were it in A at some time past or fu­ture, I would con­sider it to be a (per­haps sur­pris­ing) con­tinu­a­tion of my ex­is­tence in A.

If the Ai and the Bi are the same per­son, then I’m eth­i­cally re­spon­si­ble for the be­hav­ior of Bi for the same rea­sons I’m eth­i­cally re­spon­si­ble for my­self (Ai). If Ai and Bi are not the same per­son (even if they’re very similar peo­ple) then I’m not re­spon­si­ble for Bi at all, but I’m also no longer de-co­her­ent: there is always only one world with me in it. I take it nei­ther of these op­tions is true, and that some mid­dle ground is to be preferred: Bi is not the same per­son as me, but some­thing like a coun­ter­part. Am I not re­spon­si­ble for the ac­tions of my coun­ter­parts?

That’s a hard ques­tion to an­swer, but say I get up­loaded and copied a bunch of times. A year later, some large per­centage of my copies have be­come se­rial kil­lers, while oth­ers have not. Are the peace­ful copies morally re­spon­si­ble for the se­rial kil­ling? If we say ‘no’ then it seems like we’re com­mit­ted to at least some kind of liber­tar­i­anism as re­gards free will. I un­der­stood the com­pat­i­bil­ist view around here to be that you are re­spon­si­ble for your ac­tions by way of be­ing con­sti­tuted in such and such a way. But my peace­ful copies are con­sti­tuted in largely the same way as the kil­ler copies are. We only count them as nu­mer­i­cally differ­ent on the ba­sis of seem­ingly triv­ial dis­tinc­tions like the fact that they’re em­bod­ied in differ­ent hard­ware.

• What I meant by ‘I’ is this: say I’m in path A. I have a par­allel ‘I’ in path B if the con­figu­ra­tion of some­thing in B is such that, were it in A at some time past or fu­ture, I would con­sider it to be a (per­haps sur­pris­ing) con­tinu­a­tion of my ex­is­tence in A.

Well, OK. We are, of course, free to con­sider any en­tity we like an ex­ten­sion of our own iden­tity in the sense you de­scribe here. (I might similarly con­sider some other en­tity in my own path to be a “par­allel me” if I wish. Heck, I might con­sider you a par­allel me.)

If the Ai and the Bi are the same per­son, then I’m eth­i­cally re­spon­si­ble for the be­hav­ior of Bi for the same rea­sons I’m eth­i­cally re­spon­si­ble for my­self (Ai).

It is not at all clear that I know what the rea­sons are that I’m eth­i­cally re­spon­si­ble for my­self, if I am the sort of com­plex mostly-ig­no­rant-of-its-own-ac­tivi­ties en­tity scat­tered across mul­ti­ple branches that you are posit­ing I am. Again, trans­plant­ing an eth­i­cal in­tu­ition (like “I am eth­i­cally re­spon­si­ble for my ac­tions”) un­ex­am­ined from one con­text to a vastly differ­ent one is rarely jus­tified.

So a good place to start might be to ask why I’m eth­i­cally re­spon­si­ble for my­self, and why it mat­ters.

I take it nei­ther of these op­tions is true, and that some mid­dle ground is to be preferred: Bi is not the same per­son as me, but some­thing like a coun­ter­part.

Can you say more about that prefer­ence? I don’t share it, my­self. I would say, rather, that I have some de­gree of con­fi­dence in the claim “Ai and Bi are the same per­son” and some de­gree of con­fi­dence that “Ai and Bi are differ­ent peo­ple,” and that mul­ti­ple ob­servers can have differ­ent de­grees of con­fi­dence in these claims about a given (Ai, Bi) pair, and there’s no fact of the mat­ter.

say I get up­loaded and copied a bunch of times. A year later, some large per­centage of my copies have be­come se­rial kil­lers, while oth­ers have not. Are the peace­ful copies morally re­spon­si­ble for the se­rial kil­ling?

Say I be­long to a group of dis­tinct in­di­vi­d­u­als, who are born and raised in the usual way, with no copy­ing in­volved. A year later, some large per­centage of the in­di­vi­d­u­als in my group be­come se­rial kil­lers, while oth­ers do not. Are the peace­ful in­di­vi­d­u­als morally re­spon­si­ble for the se­rial kil­ling?

Almost all of the relevant factors governing my answer to your example seem to apply to mine as well. (My own answer to both questions is “Yes, within limits,” those limits largely being a function of the degree to which observations of Ai can serve as evidence about Bi.)

• But it seems to me that whether or not I mur­der some­one, the to­tal num­ber of mur­der-paths and the to­tal num­ber of non-mur­der-paths is the same? Is this to­tally off base? I hope that it is.

Good news! It is to­tally off base. There is noth­ing in quan­tum me­chan­ics re­quiring that the num­ber of branches cor­re­spond­ing to an ar­bi­trary macro­scopic event and its nega­tion must be equal.

• There is noth­ing in quan­tum me­chan­ics re­quiring that the num­ber of branches cor­re­spond­ing to an ar­bi­trary macro­scopic event and its nega­tion must be equal.

Aww, you had my hopes up. There’s noth­ing in my set-up that re­quires them to be equal ei­ther, just that the num­bers be fixed.

• So, fine, live in my world and don’t worry about the oth­ers. But whence that rule? That seems ar­bi­trary

That feel­ing of ar­bi­trari­ness is, IMHO, worth ex­plor­ing more care­fully.

Sup­pose, for ex­am­ple, it turns out that we don’t live in a Big World… that this is all there is, and that events ei­ther hap­pen in this world or they don’t hap­pen at all. Sup­pose you some­how were to re­ceive con­fir­ma­tion of this. Big re­lief, right? Now you re­ally can re­duce the to­tal amount of what­ever in all of ex­is­tence ev­ery­where, so ac­tions have mean­ing again.

But then you meet some­one who says “But what about hy­po­thet­i­cal peo­ple? No mat­ter how many peo­ple I don’t ac­tu­ally mur­der, there’s still countless hy­po­thet­i­cal peo­ple be­ing hy­po­thet­i­cally mur­dered! And, sure, you can tell me to just worry about ac­tual peo­ple and don’t worry about the other, but whence that rule? It seems ar­bi­trary.”

Would you find their po­si­tion rea­son­able?
What would you say to them, if not?

• But then you meet some­one who says “But what about hy­po­thet­i­cal peo­ple? No mat­ter how many peo­ple I don’t ac­tu­ally mur­der, there’s still countless hy­po­thet­i­cal peo­ple be­ing hy­po­thet­i­cally mur­dered! And, sure, you can tell me to just worry about ac­tual peo­ple and don’t worry about the other, but whence that rule? It seems ar­bi­trary.”

Well put. This ac­tu­ally does come up in a philo­soph­i­cal view known as modal re­al­ism. Roughly, if we can make true or false claims about pos­si­ble wor­lds, then those wor­lds must be ac­tual in or­der to be truth-mak­ers. So all pos­si­ble wor­lds are ac­tual.

If my some­one said what you said he said, sup­pose I ask this in re­ply:

E: “Wait, are those hypothetical people being hypothetically murdered? Is that true?”

S: “Yes! And there’s noth­ing you can do!”

E: “And there’s some reality to which this part of the map, the hypothetical-people-being-murdered, corresponds? Such that the hypothetical murder of these people is a real part of our world?”

S: “Well, sure.”

E: “Okay, well if we’re go­ing to ven­ture into modal re­al­ism then this just con­flicts in the same way.”

S: Suppose we’re not modal realists then. Suppose there’s just not really a fact of the matter about whether or not hypothetical, and therefore non-existent, people are being murdered.

E: No prob­lem. I’m just in­ter­ested in re­duc­ing real evils.

S: Isn’t that an ar­bi­trary de­ter­mi­na­tion?

E: No, it’s the exact opposite of arbitrary. I also don’t take non-existent evidence as evidence, I don’t eat non-existent fruit, etc. If we call this arbitrary, then what isn’t?

• I would cer­tainly say you’re jus­tified in not car­ing about hy­po­thet­i­cal mur­ders. I would also say you’re jus­tified in not car­ing about mur­ders in other MW branches.

What you seem to want to say here is that be­cause mur­ders in other MW branches are “ac­tual”, you care about them, but since mur­ders in my imag­i­na­tion are not “ac­tual”, you don’t.

I have no idea what the word “ac­tual” could pos­si­bly re­fer to so as to do the work you want it to do here.

There are cer­tainly clusters of con­sis­tent ex­pe­rience to which a hy­po­thet­i­cal mur­der of a hy­po­thet­i­cal per­son cor­re­sponds. Those clusters might, for ex­am­ple, take the form of cer­tain pat­terns of neu­ral ac­ti­va­tion in my brain… that’s how I usu­ally model it, any­way. I’m happy to say that those are “ac­tual” pat­terns of neu­ral ac­ti­va­tion. I would not say that they are “ac­tual” mur­dered hu­man be­ings.

That said, I’m not re­ally sure it mat­ters if they are. I mean, if they are, then… hold on, let me vi­su­al­ize… there: I just “ac­tu­ally” re­s­ur­rected them and they are now “ac­tu­ally” ex­tremely happy. Was their former mur­der still evil? At best, it seems all of my pre­con­ceived no­tions about mur­der (e.g., that it’s a per­ma­nent state change of some kind) have just been thrown out the win­dow, and I should give some se­ri­ous thought to why I think mur­der is evil in the first place.

It seems some­thing similar is true about ex­is­tence in a Big World… if I want to in­cor­po­rate that into my think­ing, it seems I ought to re­think all of my as­sump­tions. Trans­plant­ing a moral in­tu­ition about mur­der de­rived in a small world into a big world with­out any al­ter­a­tion seems like a recipe for walk­ing off con­cep­tual cliffs.

• What you seem to want to say here is that be­cause mur­ders in other MW branches are “ac­tual”, you care about them, but since mur­ders in my imag­i­na­tion are not “ac­tual”, you don’t.

Right, ex­actly. I’m tak­ing this sense of ‘ac­tual’ (not liter­ally) from the se­quences. This is from ‘On be­ing De­co­her­ent’:

You only see nearby ob­jects, not ob­jects light-years away, be­cause pho­tons from those ob­jects can’t reach you, there­fore you can’t see them. By a similar lo­cal­ity prin­ci­ple, you don’t in­ter­act with dis­tant con­figu­ra­tions.

Later on in this post EY says that the Big World is already at is­sue in spa­tial terms: some­where far away, there is an­other Esar (or some­one enough like me to count as me). The im­pli­ca­tion is that ex­ist­ing in an­other world is analo­gous to ex­ist­ing in an­other place. And I cer­tainly don’t think I’m al­lowed to ap­ply the ‘keep your own cor­ner clean’ prin­ci­ple to spa­tial zones.

In ‘Living in Many Worlds’, EY says:

“Oh, there are a few im­pli­ca­tions of many-wor­lds for ethics. Aver­age util­i­tar­i­anism sud­denly looks a lot more at­trac­tive—you don’t need to worry about cre­at­ing as many peo­ple as pos­si­ble, be­cause there are already plenty of peo­ple ex­plor­ing per­son-space. You just want the av­er­age qual­ity of life to be as high as pos­si­ble, in the fu­ture wor­lds that are your re­spon­si­bil­ity.

And you should always take joy in dis­cov­ery, as long as you per­son­ally don’t know a thing. It is mean­ingless to talk of be­ing the “first” or the “only” per­son to know a thing, when ev­ery­thing know­able is known within wor­lds that are in nei­ther your past nor your fu­ture, and are nei­ther be­fore or af­ter you.”

I take him to mean that there are really, actually many other people who exist (just in different worlds) and that I’m responsible for the quality of life for some subset of those people. And that there really are, actually, many people in other worlds who have discovered or know things I might take myself to have discovered or be the first to know. Such that it’s a small but real overturning of normality that I can’t really be the first to know something. (That, I assume, is what an implication of MW for ethics amounts to: some overturning of some ethical normality.)

I’m happy to say that those are “ac­tual” pat­terns of neu­ral ac­ti­va­tion. I would not say that they are “ac­tual” mur­dered hu­man be­ings.

If you mod­eled it to the point that you fully mod­eled a hu­man be­ing in your brain, and then mur­dered them, it seems ob­vi­ous that you did ac­tu­ally kill some­one. Hy­po­thet­i­cal mur­ders (but con­sid­ered) fail to be mur­ders be­cause they fail to be good enough mod­els.

Was their former mur­der still evil?

Yes...ob­vi­ously!

• Was their former mur­der still evil?

Yes...ob­vi­ously!

Or­di­nar­ily, I would de­scribe some­one who is un­cer­tain about ob­vi­ous things as a fool. It’s not clear to me that I’m a fool, but it is also not at all clear to me that mur­der as you’ve defined it in this con­ver­sa­tion is evil.

If you could ex­plain that ob­vi­ous truth to me, I might learn some­thing.

• Or­di­nar­ily, I would de­scribe some­one who is un­cer­tain about ob­vi­ous things as a fool. It’s not clear to me that I’m a fool, but it is also not at all clear to me that mur­der as you’ve defined it in this con­ver­sa­tion is evil.

I didn’t mean to call you a fool, only I don’t think the dis­rup­tion of your in­tu­itions is a dis­rup­tion of your eth­i­cal in­tu­itions. It’s un­in­tu­itive to think of a hu­man-be­ing as some­thing fully em­u­lated within an­other hu­man be­ing’s brain, but if this is ac­tu­ally pos­si­ble, it’s not un­in­tu­itive that end­ing this neu­ral ac­tivity would be mur­der (if it weren’t some other form of kil­ling-a-hu­man-be­ing). My point was just that the dis­tinc­tion in hard­ware can’t make a differ­ence to the ques­tion of whether or not end­ing a neu­ral ac­tivity is kil­ling, and given a set of con­stants, mur­der.

Since I don’t think we’re any longer talk­ing about my origi­nal ques­tion, I think I’ll tap out.

• Your decisions aren’t random! If you decide to do something then the vast majority of other selves you have will decide the same thing. When you do good you do indeed do good in all universes branching from this one.

(But what if what I just said wasn’t the case? Would you let your sense of ethics override the physical evidence? Look at the causal history of your morality: it comes from evolution. Do you think that if MW was true then evolution would be forced to happen differently, in order to give you different morals?)

• But what if what I just said wasn’t the case? Would you let your sense of ethics over­ride the phys­i­cal ev­i­dence?

This is a good question, but I think it’s important to understand that it’s a good question. Evidence from the physical sciences doesn’t have some fixed priority over other kinds of evidence. One could argue that it’s an unusually good source of evidence, of course, but I’m not sure how to make the comparison in this case.

• why aren’t MWI and ethics just flatly in con­flict?

• How do you know it all adds up to nor­mal­ity? What should I an­ti­ci­pate if it does, and what should I an­ti­ci­pate if it doesn’t? Or is this an a pri­ori prin­ci­ple?

• When Ein­stein over­threw the New­to­nian ver­sion of grav­ity, ap­ples didn’t stop fal­ling, planets didn’t swerve into the Sun. Every new the­ory of physics must cap­ture the suc­cess­ful pre­dic­tions of the old the­ory it dis­placed; it should pre­dict that the sky will be blue, rather than green.

So don’t think that many-wor­lds is there to make strange, rad­i­cal, ex­cit­ing pre­dic­tions. It all adds up to nor­mal­ity.

Which means that your ethics should not depend on the potential existence of other worlds we have no way of interacting with. In other words, while it might well be simpler (for some people) to reason about your ethics using the many-worlds paradigm, the outcome of this reasoning should not depend on the number of worlds.

• So, I’ve been thinking about this, and say I and everyone I know believe that it’s possible to be the first one, absolutely, to whistle a tune. This is, for our strange culture, an important ethical belief. That belief is part of what I would call ‘normality’. Now, some jerk comes along and proves MW, and so I learn that for any tune I would consider novel, odds are that it’s been whistled before in another world (I’m taking this example from EY in the sequences). So, depending on my normal, MW may add up to normality, and it may not. In a much more obvious sense, if my normal is Newtonian physics, MW doesn’t add up to normality either.

So what does adding up to nor­mal mean? Con­sider that my other stupid ques­tion. Egan’s law seems to go un-ar­gued for and un­ex­plained. If it just means what the para­graph you cite says, then MW may well abol­ish or come into con­flict with our eth­i­cal ideas, since ap­par­ently it comes into con­flict with all kinds of other ideas (like false phys­i­cal the­o­ries) and none of this re­quires the de­struc­tion of the so­lar sys­tem or fly­ing ap­ples.

• So what does adding up to nor­mal mean?

It means that if you do not ob­serve pink uni­corns daily, no new weird and won­der­ful the­ory should claim that you should have. Or, as EY puts it “ap­ples didn’t stop fal­ling, planets didn’t swerve into the Sun”. Another name for this is the cor­re­spon­dence prin­ci­ple.

If your ethics re­quires for you to be the first tune whistler in the mul­ti­verse, not just in this world, it’s not a use­ful ethics.

• If your ethics re­quires for you to be the first tune whistler in the mul­ti­verse, not just in this world, it’s not a use­ful ethics.

The use­ful­ness of the ethics (if that’s the right stan­dard to ap­ply to an eth­i­cal idea) is not rele­vant to the ex­am­ple.

That is, un­less you want to posit (and we should be su­per, su­per clear about this) that there is an a pri­ori prin­ci­ple that any ethics ca­pa­ble of be­ing con­tra­dicted by a true phys­i­cal the­ory is not use­ful. But I very much doubt you want to say that.

I think modern physics pretty obviously doesn’t add up to normality in a number of cases. Long debates about cryonics took place because part of many people’s normal understanding of personal identity (an ethical category if there ever was one) involved a conception of material constituents like atoms such that there can be my atoms versus your atoms. This just turned out to be nonsense, as we discovered through investigation of physics. The fact that atoms no more have identities qua particular instances than do numbers overturned some element of normality.

Given cases like that, how does one ac­tu­ally ar­gue for Egan’s law? It’s not enough to just state it.

• So what does adding up to nor­mal mean?

It means that if in your branch you are the first one to whis­tle the tune, there is no one else in your branch to con­tra­dict you. (Just as you would ex­pect in One World.) In some other branch some­one else was first, and in that branch you don’t think that you were the first, so again no con­flict.

if my nor­mal is New­to­nian physics

Then “adding up to nor­mal” means that even when Ein­stein ru­ins your model, all things will be­have the same way as they always did. Things that within given pre­ci­sion obeyed the New­to­nian physics, will con­tinue to do it. You will only see ex­cep­tions in un­usual situ­a­tions, such as GPS satel­lites. (But if you had GPS satel­lites be­fore Ein­stein in­vented his the­ory, you would have seen those ex­cep­tions too. You just didn’t know that would hap­pen.)

In case of moral­ity it means that if you had a rule “X is good” be­cause it usu­ally has good con­se­quences (or be­cause it fol­lows the rules, or what­ever), then “X is good” even with Many Wor­lds. The ex­cep­tion is if you try to ap­ply moral sig­nifi­cance to a pho­ton mov­ing through a dou­ble slit.

An ex­pla­na­tion may change: for ex­am­ple it was im­moral to say “if the coin ends this side up, I will kill you”, and it is still im­moral to do so, but the pre­vi­ous ex­pla­na­tion was that “it is bad to kill peo­ple with 50% prob­a­bil­ity” and the new ex­pla­na­tion is “it is bad to kill peo­ple in 50% of branches” (which means kil­ling them with 50% prob­a­bil­ity in a ran­dom branch).

• Okay, so on re­flec­tion, I think the idea that it all adds up to nor­mal­ity is just junk. It doesn’t mean any­thing. I’ll try to ex­plain:

A: MW comes into con­flict with this eth­i­cal prin­ci­ple.

B: It can’t come into con­flict. Physics always adds up to nor­mal­ity.

A: Really? Sup­pose I see an ap­ple fal­ling, and you dis­cover that there’s no such thing as an ap­ple, but that what we called ap­ples are ac­tu­ally a sub-species of blue­ber­ries. Now I’ve learned that I’ve in fact never seen an ap­ple fall, since by ‘ap­ple’ I meant the fruit of an in­de­pen­dent species of plant. So, nor­mal­ity over­turned.

B: No, that’s not an over­turn­ing of nor­mal­ity, that’s just a change of ex­pla­na­tion. What you saw was this green­ish round thing fal­ling, and you ex­plained this as an ‘ap­ple’. Now your ex­pla­na­tion is differ­ent, but the thing you ob­served is the same.

A: Ah, but let’s say science discovers that the green round thing I saw isn’t green at all. In fact, green is just the color that bounces off the thing. If it’s any color, it’s the color of the wavelengths of light it absorbs. Normality overturned.

B: But that’s just what being ‘green’ now means. What you saw was some light hitting your receptors in a way that varied over time, and you explained this as a green thing moving. The observation, the light hitting your eye over time, is the same. The explanation has shifted.

A: Now say that it turns out that (bear with me) there is no mo­tion or time. What I thought was some light hit­ting my retina over time is just my own brain co-evolv­ing with a broader wave-func­tion. Now that’s over­turn­ing nor­mal­ity.

B: No, what you ex­pe­rienced qual­i­ta­tively is the same, but the ex­pla­na­tion has changed.

A: What did I ex­pe­rience qual­i­ta­tively?

B: If you’re will­ing to go into plau­si­ble but hy­po­thet­i­cal dis­cov­er­ies, I can’t give it any de­scrip­tion that is ba­sic enough that it can’t be ‘over­turned’. Even ‘ex­pe­rience’ is prob­a­bly over­turn­able.

A: That’s why ‘it all adds up to nor­mal­ity’ is junk. By that stan­dard, noth­ing is nor­mal. If any­thing I can de­scribe as a phe­nomenon is nor­mal, then it can be over­turned un­der that de­scrip­tion.

• So my stupid ques­tion is this: why aren’t MWI and ethics just flatly in con­flict?

This ques­tion used to worry me a lot too, and at one point I also con­sid­ered the idea that we can’t “change the fun­da­men­tal amount of good­ness” but just choose a path through the branch­ing wor­lds.

The view that’s cur­rently preva­lent among LWers who study de­ci­sion the­ory is that you should think of your­self as be­ing able to change math­e­mat­i­cal facts, be­cause de­ci­sions are them­selves math­e­mat­i­cal facts and by mak­ing de­ci­sions you de­ter­mine other math­e­mat­i­cal facts via log­i­cal im­pli­ca­tion. So for ex­am­ple the amount of good­ness in a de­ter­minis­tic uni­verse like MWI, given some ini­tial con­di­tions, is a math­e­mat­i­cal fact that you can change through your de­ci­sions.

• The issue is that MWI does not address the phenomenon of a single path (your path) being empirically special. The theories, as in the code that you would have when you use Solomonoff induction on your sensory input, have to address this phenomenon: they predict (or guess) sensory input, rather than producing something which merely contains the sensory input somewhere in the middle of an enormous stream of alternatives. [putting aside for the moment that Solomonoff induction with a Turing machine would have trouble with rotational and other symmetries]

That is true of physics in general: it is by design concerned with predicting our sensory input, NOT ‘explaining it away’ by producing an enormous body of things within which the input can be found. This is why MWI, the way it is now, is seen as unsatisfactory, and why having the unphysical collapse of CI is acceptable. The goal is to guess the sensory input as well as possible, and thus the choice of path, even if made randomly, has to be part of the theory.

Furthermore, if one is to seek the shortest ‘explanatory’ theory which contains you and your input somewhere within it, but doesn’t have to include the ‘guess where you are’ part, MWI is not the winner; a program that iterates over all theories of physics and simulates them is. You get another sort of multiverse.

edit: On a more gen­eral note, one shouldn’t be con­vinced sim­ply be­cause one can’t see a sim­pler al­ter­na­tive. It’s very hard to see al­ter­na­tives in physics. Here is a good ar­ti­cle about the is­sue.

• How would I build a web­site that worked the same way as Less­wrong?

• I don’t know the de­tails, but LessWrong is forked from the Red­dit source­code—do­ing some­thing similar might be a good start.

In more general terms, you need to learn to program and write a web server program, generally by using a web application framework, and put that program on a server that you either rent from any number of places or set up yourself. A reasonable way to do this (speaking from a small amount of experience making a site much less complicated than LessWrong) is to use Python with webapp2 and Google App Engine.
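To give a feel for what "a web server program" means at its smallest, here is a rough sketch. It deliberately uses Python's standard-library `wsgiref` rather than webapp2 or App Engine, so it is a stand-in for the idea and not how LessWrong or Reddit actually work. The names `POSTS`, `render_posts`, `app` and `main` are made up for illustration, and the in-memory list is only a placeholder for a real database.

```python
from html import escape
from wsgiref.simple_server import make_server

# Hypothetical in-memory "database"; a real site would use persistent storage.
POSTS = [
    {"title": "Hello, world", "body": "First post."},
]


def render_posts(posts):
    """Return a tiny HTML page listing every post, with user text escaped."""
    items = "".join(
        "<li><b>{}</b>: {}</li>".format(escape(p["title"]), escape(p["body"]))
        for p in posts
    )
    return "<html><body><ul>{}</ul></body></html>".format(items)


def app(environ, start_response):
    """WSGI entry point: serve the post list at '/', and a 404 elsewhere."""
    if environ.get("PATH_INFO", "/") == "/":
        status = "200 OK"
        body = render_posts(POSTS).encode("utf-8")
    else:
        status = "404 Not Found"
        body = b"Not found"
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response(status, headers)
    return [body]


def main():
    """Serve the app on localhost:8000 until interrupted."""
    with make_server("127.0.0.1", 8000, app) as server:
        server.serve_forever()
```

Calling `main()` and opening `http://127.0.0.1:8000/` in a browser should show the one post. Everything that makes a site like LessWrong hard (accounts, voting, threaded comments, moderation, storage) sits on top of a core like this, which is why a framework or an existing codebase saves so much work.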

Beyond that, you’re go­ing to have to be more spe­cific about your ex­pe­riences and goals. LessWrong is not a sim­ple web­site. Don’t ex­pect to be able to write from scratch any­thing of this mag­ni­tude in less than sev­eral man years (de­pend­ing on what you count as “from scratch”). Build­ing it by, say, us­ing an ex­ist­ing sys­tem such as WordPress would be much, much less work.

• You could use the LW code it­self.

• Isn’t it al­most cer­tain that su­per-op­ti­miz­ing AI will re­sult in un­in­tended con­se­quences? I think it’s al­most cer­tain that su­per-op­ti­miz­ing AI will have to deal with their own un­in­tended con­se­quences. Isn’t the ex­pec­ta­tion of en­coun­ter­ing in­tel­li­gence so ad­vanced, that it’s perfect and in­fal­lible es­sen­tially the ex­pec­ta­tion of en­coun­ter­ing God?

• Isn’t the ex­pec­ta­tion of en­coun­ter­ing in­tel­li­gence so ad­vanced, that it’s perfect and in­fal­lible es­sen­tially the ex­pec­ta­tion of en­coun­ter­ing God?

Which god? If by “God” you mean “some­thing es­sen­tially perfect and in­fal­lible,” then yes. If by “God” you mean “that en­tity that kil­led a bunch of Egyp­tian kids” or “that en­tity that’s re­spon­si­ble for light­ning” or “that guy that an­noyed the Ro­man em­pire 2 mil­len­nia ago,” then no.

Also, es­sen­tially in­fal­lible to us isn’t nec­es­sar­ily es­sen­tially in­fal­lible to it (though I sus­pect that any at­tempt at AGI will have enough hacks and short­cuts that we can see faults too).

• Which god? If by “God” you mean “some­thing es­sen­tially perfect and in­fal­lible,” then yes.

That one. Big man in sky invented by shepherds doesn’t interest me much. Just because I’m a better optimizer of resources in certain contexts than an amoeba doesn’t make me perfect and infallible. Just because X is orders of magnitude a better optimizer than Y doesn’t make X perfect and infallible. Just because X can rapidly optimize itself doesn’t make it infallible either. Yet when people talk about the post-singularity super-optimizers, they seem to be talking about some sort of Sci-Fi God.

• Y’know, I’m not re­ally sure where that idea comes from. The op­ti­miza­tion power of even a mod­er­ately tran­shu­man AI would be quite in­cred­ible, but I’ve never seen a con­vinc­ing ar­gu­ment that in­tel­li­gence scales with op­ti­miza­tion power (though the ar­gu­ment that op­ti­miza­tion power scales with in­tel­li­gence seems sound).

• but I’ve never seen a con­vinc­ing ar­gu­ment that in­tel­li­gence scales with op­ti­miza­tion power

“op­ti­miza­tion power” is more-or-less equiv­a­lent to “in­tel­li­gence”, in lo­cal par­lance. Do you have a differ­ent defi­ni­tion of in­tel­li­gence in mind?

• One that doesn’t clas­sify evolu­tion as in­tel­li­gent.

• So the non­ap­ples the­ory of in­tel­li­gence, then?

• More gen­er­ally, a the­ory that re­quires mod­el­ing of the fu­ture for some­thing to be in­tel­li­gent.

• What are unintended consequences? An imperfect ability to predict the future? Read strictly, any finite entity’s ability to predict the future is going to be imperfect.

• What if the AI are ad­vanced over us as we are over cock­roaches, and the su­per­in­tel­li­gent AI find us just as an­noy­ing, dis­gust­ing, and hard to kill?

• What rea­son is there to ex­pect such a thing?

(Not to men­tion that, proverbs notwith­stand­ing, hu­mans can and do kill cock­roaches eas­ily; I wouldn’t want the ta­bles to be re­versed.)

• Rea­son: Cock­roaches and the be­hav­ior of hu­mans. We can and do kill in­di­vi­d­u­als and spe­cific groups of in­di­vi­d­u­als. We can’t kill all of them, how­ever. If hu­mans can get into space, the light­speed bar­rier might let far-flung tribes of “hu­man fun­da­men­tal­ists,” to bor­row a term from Charles Stross, to sur­vive, though in­di­vi­d­u­als would of­ten be kil­led and would never stand a chance in a di­rect con­flict with a su­per AI.

• Cock­roaches and the be­hav­ior of hu­mans.

In itself that doesn’t seem to be relevant evidence. “There exist species that humans cannot eradicate without major coordinated effort”. It follows neither that the same would hold for far more powerful AIs, nor that we should model the AI-human relationship on humans-cockroaches rather than humans-kittens or humans-smallpox.

If hu­mans can get into space, the light­speed bar­rier might let far-flung tribes of “hu­man fun­da­men­tal­ists,” to bor­row a term from Charles Stross, to survive

It’s easy to imag­ine spe­cific sce­nar­ios, es­pe­cially when gen­er­al­iz­ing from fic­tional ev­i­dence. In fact we don’t have ev­i­dence suffi­cient to even raise any sce­nario as con­crete as yours to the level of aware­ness.

I could as easily reply that AI that wanted to kill fleeing humans could do so by powerful enough directed lasers, which will overtake any STL ship. But this is a contrived scenario. There really is no reason to discuss it specifically. (For one thing, there’s still no evidence human space colonization or even solar system colonization will happen anytime soon. And unlike AI it’s not going to happen suddenly, without lots of advance notice.)

• It’s easy to imag­ine spe­cific sce­nar­ios, es­pe­cially when gen­er­al­iz­ing from fic­tional ev­i­dence. In fact we don’t have ev­i­dence suffi­cient to even raise any sce­nario as con­crete as yours to the level of aware­ness. … I could as eas­ily re­ply that AI that wanted to kill flee­ing hu­mans could do so by pow­er­ful enough di­rected lasers, which will over­take any STL ship. But this is a con­trived sce­nario. There re­ally is no rea­son to dis­cuss it speci­fi­cally.

A sum­mary of your points is that: while con­ceiv­able, there’s no rea­son to think it’s at all likely. Ok. How about, “Be­cause it’s fun to think about?”

Ac­tu­ally, lasers might not be prac­ti­cal against ma­neu­ver­able tar­gets be­cause of the diffrac­tion limit and the light­speed limit. In or­der to fo­cus a laser at very great dis­tances, one would need very large lenses. (Per­haps planet sized, de­pend­ing on dis­tance and fre­quency.) Tar­gets could re­spond by mov­ing out of the beam, and the light­speed limit would pre­clude im­me­di­ate re­tar­get­ing. Com­pen­sat­ing for this by mak­ing the beam wider would be very ex­pen­sive.
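For a rough sense of scale, here is a back-of-the-envelope sketch. It assumes an ideal circular lens focused on the target, for which the central Airy disk has diameter about 2.44 λL/D, and it takes the retargeting delay to be one light round trip. The wavelength, distance and lens size passed in are illustrative guesses of mine, and the function names are made up.

```python
C = 299_792_458.0        # speed of light in m/s
LIGHT_YEAR = 9.4607e15   # metres in one light-year


def spot_diameter_m(wavelength_m, distance_m, aperture_m):
    """Diameter of the central Airy disk at the target for an ideal
    circular aperture focused at that distance: 2.44 * lambda * L / D."""
    return 2.44 * wavelength_m * distance_m / aperture_m


def retarget_delay_s(distance_m):
    """Shortest possible lag between the target moving and a corrected
    beam arriving: light from the target out, then the new beam back."""
    return 2.0 * distance_m / C


# Illustrative numbers only: 1 micron light, a target 1 light-year away,
# and a roughly planet-sized lens 10,000 km across.
spot = spot_diameter_m(1e-6, LIGHT_YEAR, 1e7)
delay_years = retarget_delay_s(LIGHT_YEAR) / (365.25 * 86400)
```

With these particular inputs the spot comes out a couple of kilometres wide and the feedback loop takes about two years, so a target that drifts a few kilometres in an unpredictable direction during that time is no longer in the beam. Shrinking the lens widens the spot in proportion, which spreads the same power over a larger area.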

• Re­gard­ing lasers: I could list things the at­tack­ers might do to suc­ceed. But I don’t want to dis­cuss it be­cause we’d be spec­u­lat­ing on prac­ti­cally zero ev­i­dence. I’ll merely say that I would rather that my hopes for the fu­ture do not de­pend on a failure of imag­i­na­tion on part of an en­emy su­per­in­tel­li­gent AI.

• You’re as­sum­ing that there’s always an an­swer for the more in­tel­li­gent ac­tor. Only hap­pens that way in the movies. Some­times you get the bear, and some­times the bear gets you.

Some­times one can pin their hopes on the laws of physics in the face of a more in­tel­li­gent foe.

• It’s more fun to me to think about pleas­ant ex­tremely im­prob­a­ble fu­tures than un­pleas­ant ones. To each their own.

• There’s lots of scope for great ad­ven­ture sto­ries in dystopian fu­tures.

• Does any­one else find weapons fas­ci­nat­ing? Swords, guns, maces and axes?

I re­ally want a set of fully func­tional weapons, as ob­jects of art and power. Costly, though.

How much do other peo­ple spend on stuff they hang on the wall and oc­ca­sion­ally take down to ad­mire?

• I’ve had this sort of im­pulse be­fore.

Lately I try to min­i­mize “stuff to hang on the wall and oc­ca­sion­ally take down to ad­mire”. It brings me lit­tle benefit with all sorts of strange hid­den costs. But I tend to­wards clut­ter so that may just be in my case. And my wife is an artist, so I ex­pect to have more ob­jects-to-be-ad­mired than I know what to do with, with­out spend­ing money on some.

• I’m near the end of a ma­jor de­clut­ter­ing phase, which has opened up a lot of room in my small apart­ment, and shows off my rather spar­tan walls.

Also, I made a list. That seems to be the pri­mary pre­cur­sor to buy­ing lots of things: mak­ing a list, rank­ing and com­par­ing items. It hap­pened with a few of my pre­vi­ous pro­jects, and I’m already halfway through the new list, though I man­aged to keep my­self to the cheap and small stuff so far.

• I’ve read the ma­jor­ity of the Less Wrong ar­ti­cles on metaethics, and I’m still very very con­fused. Is this nor­mal or have I missed some­thing im­por­tant? Is there any sort of con­sen­sus on metaethics be­yond the rul­ing out of the very ob­vi­ously wrong?

• Your re­sponse is nor­mal—the metaethics se­quence is quite opaque, es­pe­cially com­pared to oth­ers, like the “Mys­te­ri­ous Ques­tions” se­quence.

I’m doubtful there is much consensus on the correct metaethics in this community; anecdotal evidence is that there isn’t even consensus on the meaning of the technical vocabulary we use. For an intermediate look into some of the issues, I suggest the Stanford Encyclopedia of Philosophy entries on moral realism and moral anti-realism.

Also, I recently realized in discussions that realism is a two-place word, not a one-place word. Thus, some inferential distance in these types of discussions is between those who use the label “moral realism” to refer to realism(morality, agent) and those who refer to realism(morality, humanity).

• Thanks for this. I’m already aware of all of the defi­ni­tions you’ve men­tioned, and in fact I don’t like to use the word re­al­ism be­cause of the am­bi­guity.

Is there an ob­vi­ous next step, once you re­al­ise what the op­tions are and what the ques­tions are, or are there only the hard ques­tions left?

• It de­pends on what you think the next ques­tions are. Uncer­tainty about the truth of moral re­al­ism1, moral re­al­ism2, or anti-re­al­ism leads the in­quiry in one di­rec­tion. If one is satis­fied with the meta-eth­i­cal is­sue, ob­ject level moral ques­tions pre­dom­i­nate—and prob­a­bly must be ap­proached differ­ently de­pend­ing on one’s meta-eth­i­cal com­mit­ments.

If you are un­sure about what the next step is, I might recom­mend read­ing Ca­mus’ “The Stranger” and ex­am­in­ing what you think the main char­ac­ter is do­ing wrong—that should help you fo­cus your in­ter­est on ob­ject-level ethics or meta-ethics.

• Has Yud­kowsky fully writ­ten up ex­actly why he be­lieves what he be­lieves about AI? The dis­claimer at the top of this page trou­bles me some­what.

• Meta: One prob­lem with this thread is that it im­me­di­ately frames all ques­tions as “stupid”. I’m not sure ques­tions should be ap­proached from the per­spec­tive of “This point must be wrong since the Se­quences are right. How is this point wrong?” Some of the ques­tions might be in­sight­ful. Can we take “stupid” out of it?

• I think taking the stupid out would make it worse. Making it a stupid questions thread makes it a safe space to ask questions that FEEL stupid to the asker. The point of this thread isn’t to enable important critiques of the sequences, it’s to make it easier to ask questions when they feel like everyone else already acts like they know the answer or something. There can be other venues for actual critiques or serious questions about how accurate the sequences are.

• “Ba­sic ques­tions”, “back­ground ques­tions”, “sim­ple ques­tions” or even “Ex­plain Like I’m Five” would all get the point across with­out “stupid”.

• “Sim­ple ques­tion” makes you won­der whether your ques­tion is sim­ple.

• “Uh...stupid ques­tion?” is a fairly com­mon way of in­tro­duc­ing a ques­tion that doesn’t seem to have been ad­dressed, but to which ev­ery­one else seems to take the an­swer for granted. It doesn’t pre­sup­pose the cor­rect­ness of the ma­te­rial be­ing ques­tioned. It’s a diplo­matic way of say­ing “ei­ther you’ve not ex­plained this prop­erly, or I’ve missed some­thing”.

I would rather keep it as “stupid ques­tions” (a known turn of phrase refer­ring to this spe­cific cir­cum­stance, which is how it’s used), rather than some­thing more awk­ward yet marginally less am­bigu­ous.