# A Rationality Condition for CDT Is That It Equal EDT (Part 1)

[Epistemic Status: this series of two posts gives some arguments which, in my eyes, make it difficult to maintain a position other than CDT=EDT, but not impossible. As I explain at the end of the second post, it is still quite tenable to suppose that CDT and EDT end up taking different actions.]

Previously, I argued that fair comparisons of CDT and EDT (in which the same problem representation is given to both decision theories) will conclude that CDT=EDT, under what I see as reasonable assumptions. Recently, Paul Christiano wrote a post arguing that, all things considered, the evidence strongly favors EDT. Jessica Taylor pointed out that Paul didn’t address the problem of conditioning on probability-zero events, but she came up with a novel way of addressing that problem by taking the limit of small probabilities: COEDT.

Here, I provide further arguments that rationality constraints point in the direction of COEDT-like solutions.

Note that I argue for the conclusion that CDT=EDT, which is somewhat different from arguing directly for EDT; my line of reasoning suggests some additional structure which could be missed by advocating EDT in isolation (or CDT in isolation). Paul’s post described CDT as a very special case of EDT, in which our action is independent of other things we care about. This is true, but we can just as accurately describe EDT as a very special case of CDT, in which all probabilistic relationships that remain after conditioning on what we know turn out to also be causal relationships. I more often think in the second way, because CDT can have all sorts of counterfactuals, depending on how causation works; EDT claims that these are only correct when they agree with the conditional probabilities.

(ETA: When I say “CDT”, I’m pointing at some kind of steel-man of CDT which uses logical counterfactuals rather than physical counterfactuals. TDT is a CDT in this sense, whereas UDT could be either CDT or EDT.)

This post will be full of conjectural sketches, and mainly serves to convey my intuitions about how COEDT could fit into the larger picture.

# Hyperreal Probability

Initially, thinking about COEDT, I was concerned that although something important had been accomplished, the construction via limits didn’t seem fundamental enough to belong in our basic notion of rationality. Then, I recalled how hyperreal numbers (which can be thought of as sequences of real numbers) arise as a natural generalization in the foundations of decision theory. This crops up in several different forms in different areas of Bayesian foundations, but most critically for the current discussion, in the question of how to condition on probability-zero events. Quoting an earlier post of mine:

> In What Conditional Probabilities Could Not Be, Alan Hajek argues that conditional probability cannot possibly be defined by Bayes’ famous formula, due primarily to its inadequacy when conditioning on events of probability zero. He also takes issue with other proposed definitions, arguing that conditional probability should instead be taken as primitive.
>
> The most popular way of doing this is Popper’s axioms of conditional probability. In Learning the Impossible (Vann McGee, 1994), it’s shown that conditional probability functions following Popper’s axioms and nonstandard-real probability functions with conditionals defined according to Bayes’ theorem are inter-translatable. Hajek doesn’t like the infinitesimal approach because of the resulting non-uniqueness of representation; but, for those who don’t see this as a problem but who put some stock in Hajek’s other arguments, this would be another point in favor of infinitesimal probability.

In other words, there is an axiomatization of probability, Popper’s axioms, which takes conditional probability to be fundamental rather than derived. This approach is relatively unknown outside philosophy, but is often advocated by philosophers as a superior notion of probability, largely because it allows one to condition on probability-zero events. Popper’s axioms are in some sense equivalent to allowing hyperreal probabilities, which also means (with a little mathematical hand-waving; I haven’t worked this out in detail) we can think of them as a limit of a sequence of strictly nonzero probability distributions.
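As a concrete toy illustration of the limit idea (my own sketch, not taken from McGee’s paper or Jessica’s construction): represent a distribution that assigns probability zero to some event as the limit of a sequence of strictly positive distributions, and define conditionals on that event as the limit of ordinary Bayes-formula conditionals along the sequence.

```python
from fractions import Fraction

def conditional_in_limit(dist_seq, event, given, n_max=100):
    """Approximate lim_n P_n(event | given), where dist_seq(n) is a
    strictly positive distribution over a finite outcome set.
    Each ratio is well-defined because every P_n assigns nonzero
    probability to `given`, even if the limiting distribution doesn't."""
    ratios = []
    for n in range(1, n_max + 1):
        p = dist_seq(n)
        p_given = sum(pr for o, pr in p.items() if given(o))
        p_both = sum(pr for o, pr in p.items() if given(o) and event(o))
        ratios.append(p_both / p_given)
    return ratios[-1]

# Toy decision problem: action "b" has probability 1/(n+1), which goes
# to 0 in the limit, yet conditioning on "b" stays well-defined.
def dist_seq(n):
    eps = Fraction(1, n + 1)
    return {("a", 10): 1 - eps, ("b", 5): eps}

p = conditional_in_limit(dist_seq,
                         event=lambda o: o[1] == 5,
                         given=lambda o: o[0] == "b")
# The ratio is exactly 1 at every stage, so the limit is 1.
```

This is the same move COEDT makes: the “real part” of the distribution gives the action probability zero, but the conditional is inherited from the approximating sequence.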

All of this agrees nicely with Jessica’s approach.

I take this to strongly suggest that reasonable approaches to conditioning on probability-zero events in EDT will share the limit-like aspect of Jessica’s approach, even if it isn’t obvious that they do. (Popper’s axioms are “limit-like”, but this was probably not obvious to Popper.) The major contribution of COEDT beyond this is to provide a particular way of constructing such limits.

(Having the idea “counterfactuals should look like conditionals in hyperreal probability distributions” is not enough to solve decision theory problems alone, since it is far from obvious how we should construct hyperreal probability distributions over logic to get reasonable logical counterfactuals.)

# Hyperreal Bayes Nets & CDT=EDT

(The following argument is the only justification of the title of the post which will appear in Part 1. I’ll have a different argument for the claim in the title in Part 2.)

The CDT=EDT argument can now be adapted to hyperreal structures. My original argument required:

1. Probabilities & Causal Structure are Compatible: The decision problem is given as a Bayes net, including an action node (for the actual action taken by the agent) and a decision node (for the mixed strategy the agent decides on). The CDT agent interprets this as a causal net, whereas the EDT agent ignores the causal information and treats it as a probability distribution.

2. Exploration: All action probabilities are bounded away from zero in the decision; that is, the decision node is restricted to mixed strategies in which each action gets some minimal probability.

3. Mixed-Strategy Ratifiability: The agents know the state of the decision node. (This can be relaxed to approximate self-knowledge under some additional assumptions.)

4. Mixed-Strategy Implementability: The action node doesn’t have any parents other than the decision node.

I justified assumption #2 as an extension of the desire to give EDT a fair trial: EDT is only clearly defined in cases with epsilon-exploration, so I argued that CDT and EDT should be compared with epsilon-exploration. However, if you prefer CDT because EDT isn’t well-defined when conditioning on probability-zero actions, this isn’t much of an argument.
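For reference, the epsilon-exploration of assumption #2 can be sketched as mixing whatever strategy the agent chooses with the uniform distribution; the function name and the 5-and-10 policy below are my own illustration, not anything from the original argument.

```python
def epsilon_explore(strategy, epsilon):
    """Return a mixed strategy in which every action has probability at
    least epsilon, by mixing the original strategy with the uniform
    distribution. Requires epsilon <= 1/len(strategy)."""
    n = len(strategy)
    assert 0 < epsilon <= 1 / n
    k = epsilon * n  # total weight moved onto the uniform distribution
    return {a: (1 - k) * p + k / n for a, p in strategy.items()}

# A hard-coded "take the $10" policy, softened so that conditioning on
# "take the $5" never involves a probability-zero event.
mixed = epsilon_explore({"take_10": 1.0, "take_5": 0.0}, 0.01)
# mixed is approximately {"take_10": 0.99, "take_5": 0.01}
```

The point of the restriction is exactly that every conditional probability EDT needs is then defined by the ordinary ratio formula.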

We can now address this by requiring conditionals on probability-zero events to be limits of sequences of conditionals in which the event has greater-than-zero probability. Or (I think equivalently), we can think of the probability distribution as the real part of a hyperreal probability distribution.

Having done this, we can apply the same CDT=EDT result to Bayes nets with hyperreal conditional probability tables. This shows that CDT still equals EDT without restricting to mixed strategies, so long as conditionals on probability-zero actions are defined via limits.

This still leaves the other questionable assumptions behind the CDT=EDT theorem.

#1 (compatible probability & causality): I framed this assumption as the main condition for a fair fight between CDT and EDT: if the causal structure is not compatible with the probability distribution, then you are basically handing different problems to CDT and EDT and then complaining that one gets worse results than the other. However, the case is not as clear as I made it out to be. In cases where CDT/EDT agents are in specific decision problems which they understand well, the causal structure and probabilistic structure must be compatible. However, boundedly rational agents will have inconsistent beliefs, and it may be that beliefs about causal structure are sometimes inconsistent with other beliefs. An advocate of CDT or EDT might say that the differentiating cases are exactly such inconsistent examples.

Although I agree that it’s important to consider how agents deal with inconsistent beliefs (that’s logical uncertainty!), I don’t currently think it makes sense to judge them on inconsistent decision problems. So, I’ll set aside such problems.

Notice, however, that one might contest whether there’s necessarily a reasonable causal structure at all, and deny #1 that way.

#3 (ratifiability): The ratifiability assumption is a kind of equilibrium concept; the agent’s mixed strategy has to be in equilibrium with knowledge of that very mixed strategy. I argued that it is as much a part of understanding the situation the agent is in as anything else, and that it is usually approximately achievable (i.e., it doesn’t cause terrible self-reference problems or imply logical omniscience). However, I didn’t prove that a ratifiable equilibrium always exists! Non-existence would trivialize the result, making it into an argument from false premises to a false conclusion.

Jessica’s COEDT results address this concern, showing that this level of self-knowledge is indeed feasible.

#4 (implementability): I think of this as the shakiest assumption; it is easy to set up decision problems which violate it. However, I tend to think such setups get the causal structure wrong: other parents of the action should instead be thought of as children of the action. Furthermore, if an agent is learning about the structure of a situation by repeated exposure to that situation, implementability seems necessary for the agent to come to understand the situation it is in: parents of the action will look like children if you try to perform experiments to see what happens when you do different things.

I won’t provide any direct arguments for the implementability constraint in the rest of this post, but I’ll be discussing other connections between learning and counterfactual reasoning.

# Are We Really Eliminating Exploration?

## Ways of Taking Counterfactuals Are Somewhat Interchangeable

When thinking about decision theory, we tend to focus on putting the agent in a particular well-defined problem. However, realistically, an agent has a large amount of uncertainty about the structure of the situation it is in. So, a big part of getting things right is learning what situation you’re in.

Any reasonable way of defining counterfactuals for actions, be it CDT or COEDT or something else, is going to be able to describe essentially any combination of consequences for the different actions. So, for an agent who doesn’t know what situation it is in, any system of counterfactuals is possible no matter how counterfactuals are defined. In some sense, this means that getting counterfactuals right will be mainly up to the learning. Choosing between different kinds of counterfactual reasoning is a bit like choosing between different priors: you would hope it gets washed out by learning.

## Exploration Is Always Necessary for Learning Guarantees

COEDT eliminates the need for exploration in 5-and-10, which intuitively means cases where it should be really, really obvious what to do. It isn’t clear to what extent COEDT helps with other issues. I’m skeptical that COEDT alone will allow us to get the right counterfactuals for game-theoretic reasoning. But it is really clear that COEDT doesn’t change the fundamental trade-off between learning guarantees (via exploration) and Bayesian optimality (without exploration).

This is illustrated by the following problem:

Scary Door Problem. According to your prior, there is some chance that doors of a certain red color conceal monsters who will destroy the universe if disturbed. Your prior holds that this is not very strongly correlated with any facts you could observe without opening such a door. So, there is no way to know whether such doors conceal universe-destroying monsters without trying them. If you knew such doors were free of universe-destroying monsters, there are various reasons why you might sometimes want to open them.

The scary door problem illustrates the basic trade-off between asymptotic optimality and subjective optimality. Epsilon-exploration would guarantee that you occasionally open scary doors. If such doors conceal monsters, you destroy the universe. However, if you refuse to open scary doors, then it may be that you never learn to perform optimally in the world you’re in.
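To put a number on the trade-off (a back-of-the-envelope illustration of my own, not from the post): an agent that explores with probability ε on each decision opens a scary door eventually with probability approaching 1.

```python
def p_never_open(epsilon, rounds):
    """Probability that an epsilon-explorer never opens the scary door
    over `rounds` independent decisions: (1 - epsilon)^rounds."""
    return (1 - epsilon) ** rounds

# Even epsilon = 1% makes opening the door a near-certainty in the long run.
survival = [p_never_open(0.01, n) for n in (10, 100, 1000)]
# roughly [0.90, 0.37, 0.00004]
```

The learning guarantee and the guarantee of eventually taking the catastrophic action are the same guarantee; that is the trade-off in miniature.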

What COEDT does is show that the scary door and 5-and-10 really are different sorts of problem. If there weren’t approaches like COEDT which eliminate the need for exploration in 5-and-10, we would be forced to conclude that they’re the same: no matter how easy the problem looks, you have to explore in order to learn the right counterfactuals.

So, COEDT shows that not all counterfactual reasoning has to reduce to learning. There are problems you can get right by reasoning alone. You don’t always have to explore; you can refuse to open scary doors, while still reliably picking up the $10.

I mentioned that choosing between different notions of counterfactual is kind of like choosing between different priors: you might hope it gets washed out by learning. The scary door problem illustrates why we might not want the learning to be powerful enough to wash out the prior. This means getting the prior right is quite important.

## You Still Explore in Logical Time

If you follow the logical-time analogy, it seems like you can’t ever really construct logical counterfactuals without exploration in some sense: if you reason about a counterfactual, the counterfactual scenario exists somewhere in your logical past, since it is a real mathematical object. Hence, you must take the alternate action sometimes in order to reason about it at all.

So, how does a COEDT agent manage not to explore?

COEDT can be thought of as “learning” from an infinite sequence of agents who explore less and less. None of those agents are COEDT agents, but they get closer and closer. If each of these agents exists at a finite logical time, COEDT exists at an infinite logical time, greater than that of any of the agents COEDT learns from. So, COEDT doesn’t need to explore because COEDT doesn’t try to learn from agents maximally similar to itself; it is OK with a systematic difference between itself and the reference class it logically learns from.

This systematic difference may allow us to drive a wedge between the agent and its reference class to demonstrate problematic behavior. I won’t try to construct such a case today.

In the COEDT post, Jessica says:

> I consider COEDT to be major progress in decision theory. Before COEDT, there were (as far as I know) 3 different ways to solve 5 and 10, all based on counterfactuals:
>
> • Causal counterfactuals (as in CDT), where counterfactuals are worlds where physical magic happens to force the agent’s action to be something specific.
>
> • Model-theoretic counterfactuals (as in modal UDT), where counterfactuals are models in which false statements are true, e.g. where PA is inconsistent.
>
> • Probabilistic conditionals (as in reinforcement learning and logical-inductor-based decision theories such as LIEDT/LICDT and asymptotic decision theory), where counterfactuals are possible worlds assigned a small but nonzero probability by the agent, in which the agent takes a different action through “exploration”; note that ADT-style optimism is a type of exploration.
>
> COEDT is a new way to solve 5 and 10. My best intuitive understanding is that, whereas ordinary EDT (using ordinary reflective oracles) seeks any equilibrium between beliefs and policy, COEDT specifically seeks a not-extremely-unstable equilibrium (though not necessarily one that is stable in the sense of dynamical systems), where the equilibrium is “justified” by the fact that there are arbitrarily close almost-equilibria. This is similar to trembling-hand perfect equilibrium. To the extent that COEDT has counterfactuals, they are these worlds where the oracle distribution is not actually reflective but is very close to the actual oracle distribution, and in which the agent takes a suboptimal action with very small probability.

Based on my picture, I think COEDT belongs in the modal UDT class. Both proposals can be seen as a special sort of exploration where we explore if we are in a nonstandard model. Modal UDT explores if PA is inconsistent. COEDT explores if a randomly sampled positive real in the unit interval happens to be less than some nonstandard epsilon. :)

(Note that describing them in this way is a little misleading, since it makes them sound uncomputable. Modal UDT in particular is quite computable, if the decision problem has the right form and if we are happy to assume that PA is consistent.)

I’ll be curious to see how well this analogy holds up. Will COEDT have fundamentally new behavior in some sense?

More thoughts to follow in Part 2.

• > COEDT can be thought of as “learning” from an infinite sequence of agents who explore less and less.

Interestingly, the issue COEDT has with sequential decision problems looks suspiciously similar to folk theorems in iterated game theory (which also imply that completely-aligned agents can get a very bad outcome, because they will each maximally punish anyone who doesn’t play the grim-trigger strategy). There might be some kind of folk theorem for COEDT, though there’s a complication in that, conditioning on yourself taking a probability-0 action, you get both worlds where you are being punished and ones where you are punishing someone else, which might mean counterfactual punishments can’t be maximal for everyone at once (yay?).

COEDT ensures that conditionals on each action exist at all, but it doesn’t ensure that agents behave even remotely sanely in these conditionals, as it’s still conditioning on a very rare event, and the relevant rationality conditions permit agents to behave insanely with very small probability. What would be really nice is to get some set of conditional beliefs under which:

• no one takes any strictly dominated action with nonzero probability (i.e. an action such that all possible worlds where the agent takes this action are worse than all possible worlds where the agent doesn’t)

• conditional on any subset of the agents taking non-strictly-dominated actions, no agent takes any strictly dominated action with nonzero probability

(I suspect this is easier for common-payoff problems; for non-common-payoff problems, agents might take strictly dominated actions as a form of extortion.)

COEDT doesn’t get this, but perhaps a similar construction (maybe using the hyperreals?) does.

• So, I haven’t really read this in any detail, but: I am very, very wary of the use of hyperreal and/or surreal numbers here. While, as I said, I haven’t taken a thorough look at this, to me these look like “well, we need infinitesimals and this is what I’ve heard of” rather than there being any real reason to pick one of these two. I seriously doubt that either is a good choice.

Hyperreals require picking a free ultrafilter; they’re not even uniquely defined. Surreal numbers (pretty much) completely break limits. (Hyperreals kind of break limits too, due to being of uncountable cofinality, but not nearly as extensively as surreal numbers do, which are of proper-class cofinality.) If you’re picking a number system, you need to consider what you’re actually going to do with it. If you’re going to do any sort of limits or integration with it (and what else is probability for, if not integration?), you probably don’t want surreal numbers, because limits are not going to work there. (Some things that are normally done with limits can be recovered for surreals by other means, e.g. there’s a surreal exponential, but you don’t define it as a limit of partial sums, because that doesn’t work. So, maybe you can develop the necessary theory based on something other than limits, but I’m pretty sure it’s not something that already exists which you can just pick up and use.)

Again: pick number systems for what they do. Hyperreals have a specific point, which is the transfer principle. If you’re not going to be using the transfer principle, you probably don’t want hyperreals. And as I already said, if you’re going to be taking any sort of limit, you probably don’t want surreals.

Consider asking whether you need a system of numbers at all. You mention sequences of real numbers; perhaps that’s simply what you want? Sequences of real numbers, not modulo a free ultrafilter? You don’t need to use an existing system of numbers; you can purpose-build one. And you don’t need to use a system of numbers at all; you can just use appropriate objects, whatever they may be. (Oftentimes it makes more sense to represent “orders of infinity” by functions of different growth rates, or, I guess here, sequences of different growth rates.)

(Honestly, if infinitesimal probabilities or utilities are coming up, I’d consider that a flag that something has likely gone wrong; we have good reasons to use real numbers for these, which I’m sure you’re already familiar with (but here’s a link for everyone else :P). But I’ll admit that I haven’t read this thing in any detail, and you are going beyond that sort of classical context, so, hey, who knows.)

• > #4 (implementability): I think of this as the shakiest assumption; it is easy to set up decision problems which violate it. However, I tend to think such setups get the causal structure wrong. Other parents of the action should instead be thought of as children of the action. Furthermore, if an agent is learning about the structure of a situation by repeated exposure to that situation, implementability seems necessary for the agent to come to understand the situation it is in: parents of the action will look like children if you try to perform experiments to see what happens when you do different things.

This assumption seems sketchy to me. In particular, what if you make 2 copies of a deterministic agent, move them physically far from each other, give them the same information, and ask each to select an action? Clearly, if a rational agent is uncertain about either agent’s action, then they will believe the two agents’ actions to be (perfectly) correlated. The two actions can’t each be children of each other...

• I maybe should have clarified that when I say CDT I’m referring to a steel-man CDT which would use some notion of logical causality. I don’t think the physical counterfactuals are a live hypothesis in our circles, but several people advocate reasoning which looks like logical causality.

Implementability asserts that you should think of yourself as logico-causally controlling your clone when it is a perfect copy.

• If your decision logico-causally controls your clone’s decision and vice versa, doesn’t that imply a non-causal model (since it has a cycle)?

In the case of an exact clone this is less of an issue, since there’s only one relevant logical fact. But in cases where something like correlated equilibrium is being emulated on logical uncertainty (as in this post), the decisions could be logically correlated without being identical.

[EDIT: in the case of correlated equilibrium specifically, there actually is a signal (which action you are told to take), and your action is conditionally independent of everything else given this signal, so there isn’t a problem. However, in COEDT, each agent knows the oracle distribution but not the oracle itself, which means they consider their own action to be correlated with other agents’ actions.]

• Surreal numbers are generally better than hyperreals for many use cases, as hyperreals aren’t unique and require you to choose a principle ultrafilter or something.

• I’ve already mentioned this in a separate comment, but surreals come with a lot of problems of their own (basically, limits don’t work). I don’t like to say this, but your comment gives off the same “oh well, we need infinitesimals and this is what I’ve heard of” impression as above. Pick systems of numbers based on what they do. Surreals probably don’t do whatever’s necessary here: how are you going to do any sort of integration?

(Also, you mean a free ultrafilter, not a principal one.)

• They solve a lot more problems than people realise. I’ll be making a post on this soon. And what’s the issue re: limits not working?

Later edits: various edits for clarity; also, the “transfinite sequences suffice” thing is easy to verify, it doesn’t require some exotic theorem.
Yet later edit: added another example.

So, to a large extent this is a problem with non-Archimedean ordered fields in general; the surreals just exacerbate it. So let’s go through this in stages.

===Stage 1: Infinitesimals break limits===

Let’s start with an example. In the real numbers, the limit as n goes to infinity of 1/n is 0. (Here n is a natural number, to be clear.)

If we introduce infinitesimals (even just as minimally as, say, passing to R(ω)), that’s not so, because if you have some infinitesimal ε, the sequence will not get within ε of 0.

Of course, that’s not necessarily a problem; I mean, that’s just restating that our ordered field is no longer Archimedean, right? Of course 1/n is no longer going to go to 0, but is 1/n really the right thing to be looking at? How about, say, 1/x, as x goes to infinity, where x takes values in this field of ours? That still goes to 0. So it may seem like things are fine, like we just need to get these sequences out of our head and make sure we’re always taking limits of functions, not sequences.

But that’s not always so easy to do. What if we look at x^n, where |x|<1? If x isn’t infinitesimal, that’s no longer going to go to 0. It may still go to 0 in some cases (in R(ω), certainly 1/ω^n will still go to 0), but 1/2^n sure won’t. And what do we replace that with? 1/2^x? How do we define that? In certain settings we may be able to (hell, there’s a theory of the surreal exponential, so in the surreals we can), but not in general. And doing that requires first inventing the surreal exponential, which... well, I’ll talk more about that later, but, hey, let’s talk about it a bit right now. How are we going to define the exponential? Normally we define exp(x) to be the limit of 1, 1+x, 1+x+x^2/2, ... but that’s not going to work anymore. If we try to take exp(1), expecting an answer of e, what we get is that the sequence doesn’t converge, due to the cloud of infinitesimals surrounding e; it’ll never get within 1/ω of e. For some values maybe it’ll converge, but not for enough of them to do what we want.

Now, the exponential is nice, so maybe we can find another definition (and, as mentioned, in the case of the surreals indeed we can, while obviously in the case of the hyperreals we can do it componentwise). But other cases can be much worse. Introducing infinitesimals doesn’t break limits entirely, but it likely breaks the limits that you’re counting on, and that can be fatal on its own.
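The Stage 1 failure can be written out in one line; this is just the standard non-Archimedean argument, made explicit:

```latex
% In R(\omega): the sequence 1/2^n does not converge to 0, because each
% term is a positive real, while 1/\omega lies below every positive real:
\forall n \in \mathbb{N}:\quad
\left|\tfrac{1}{2^n} - 0\right| \;=\; \tfrac{1}{2^n} \;>\; \tfrac{1}{\omega},
% so no tail of the sequence ever enters the neighborhood
% (-1/\omega,\; 1/\omega) of 0.
```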

===Stage 2: Uncountable cofinality breaks limits harder===

Stage 2 is really just a slight elaboration of stage 1. Once your field is large enough to have uncountable cofinality (like, say, the hyperreals), no sequence (with domain the natural numbers) will converge (unless it’s eventually constant). If you want to take limits, you’ll need transfinite sequences of uncountable length, or you simply will not get convergence.

Again, when you can rephrase things from sequences (with domain the natural numbers) to functions (with domain your field), things are fine, because obviously your field’s cofinality is equal to itself. But you can’t always do that, or at least not so easily. Again: it would be nice if, for |x|<1, we had x^n approaching 0, and once we hit uncountable cofinality, that is simply not going to happen for any nonzero x.

(A note: in general in topology, not even transfinite sequences are good enough for general limits, and you need nets/filters. But for ordered fields, transfinite sequences (of length equal to the field’s cofinality) are sufficient. Hence the focus on transfinite sequences rather than being ultra-general and using nets.)

Note that, of course, the hyperreals are used for nonstandard analysis, but nonstandard analysis doesn’t involve taking limits in the hyperreals; that’s the point: limits in the reals correspond to non-limit-based things in the hyperreals.

===Stage 3: The surreals break limits as hard as is possible===

So now we have the surreals, which take uncountable cofinality to the extreme. Our cofinality is no longer merely uncountable; it’s not even an actual ordinal! The “cofinality” of the surreals is the “ordinal” represented by the class of all ordinals (or the “cardinal” of the class of all sets, if you prefer to think of cofinalities as cardinals). We have proper-class cofinality.

Limits of sequences are gone. Limits of ordinary transfinite sequences are gone. All that remains working are limits of sequences whose domain consists of the entire class of all ordinals. Or, again, other things with proper-class cofinality; 1/x still goes to 0 as x goes to infinity (again, letting x range over all surreals; note that that’s a very strong notion of “goes to infinity”!). You still have limits of surreal functions of a surreal variable. But as I keep pointing out, that’s not always good enough.

I mean, really: in terms of ordered fields, the real numbers are the best possible setting for limits, because of the existence of suprema. Every set that’s bounded above has a least upper bound. By contrast, in the surreals, no set that’s bounded above has a least upper bound! That’s kind of their defining property; if you have a set S and an upper bound b then, oops, {S|b} sneaks right in between. Proper classes can have suprema, yes, but, as I keep pointing out, you don’t always have a proper class to work with; oftentimes you just have a plain old countably infinite set. As such, in contrast to the reals, the surreal numbers are the worst possible setting for limits.

The re­sult is that do­ing things with sur­re­als be­yond ad­di­tion and mul­ti­pli­ca­tion typ­i­cally re­quires ba­si­cally rein­vent­ing those things. Now, of course, the sur­real num­bers have some­thing that vaguely re­sem­ble limits, namely, {left stuff|right stuff} -- the “sim­plest in an in­ter­val” con­struc­tion. I mean, if you want, say, √2, you can just put {x∈Q, x>0, x^2<2 | x∈Q, x>0, x^2>2}, and, hey, you’ve got √2! Looks al­most like a limit, doesn’t it? Or a Dedekind cut? Sure, there’s a huge cloud of in­finites­i­mals sur­round­ing √2 that will thwart at­tempts at limits, but the sim­plest-in-an-in­ter­val con­struc­tion cuts right through that and snaps to the sim­plest thing there, which is of course √2 it­self, not √2+1/​ω or some­thing.

Added later: Similarly, if you want, say, ω^ω, you just take {ω,ω^2,ω^3,...|}, and you get ω^ω. Once again, it gets you what a limit “ought” to get you—what it would get you in the or­di­nals—even though an ac­tual limit wouldn’t work in this set­ting.

But the prob­lem is, de­spite these sug­ges­tive ex­am­ples show­ing that snap­ping-to-the-sim­plest looks like a limit in some cases, it’s ob­vi­ously the wrong thing in oth­ers; it’s not some gen­eral drop-in sub­sti­tute. For in­stance, in the real num­bers you define exp(x) as the limit of the se­quence 1, 1+x, 1+x+x^2/​2, etc. In the sur­re­als we already know that won’t work, but if you make the novice mis­take in fix­ing it of in­stead try­ing to define exp(x) as {1,1+x,1+x+x^2/​2,...|}, you will get not exp(1)=e but rather exp(1)=3. Oops. We didn’t want to snap to some­thing quite that sim­ple. And that’s hard to pre­vent.
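A quick numeric sanity check of the claim (exact rational arithmetic only; nothing surreal-specific): every partial sum of the series for exp(1) is a rational strictly below 3, so the left set in {1,1+x,1+x+x^2/2,...|} at x=1 does nothing to exclude 3.

```python
from fractions import Fraction
from math import factorial

# Exact partial sums s_n = sum_{k=0}^{n} 1/k! of the series for exp(1).
partial_sums = []
s = Fraction(0)
for k in range(12):
    s += Fraction(1, factorial(k))
    partial_sums.append(s)

# The first few left-set elements: 1, 2, 5/2, 8/3, ...
assert partial_sums[:4] == [Fraction(1), Fraction(2), Fraction(5, 2), Fraction(8, 3)]

# Strictly increasing, yet every one is below 3 (the sums converge to e < 3),
# so nothing in the left set excludes 3 -- and 3 = {2|} is very simple.
assert all(a < b for a, b in zip(partial_sums, partial_sums[1:]))
assert all(sn < 3 for sn in partial_sums)
```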

You can do it—there is a theory of the surreal exponential—but it requires care. And it requires basically reinventing whatever theory it is that you’re trying to port over to the surreal numbers; it’s not a nice straight port like so many other things in mathematics. It’s been done for a number of things! But not, I think, for the things you need here.

Martin Kruskal tried to develop a theory of surreal integration back in the 70s; he ultimately failed, and I’m pretty sure nobody has succeeded since. And note that this was for surreal functions of a single surreal variable. For surreal utilities and real probabilities you’d need surreal functions on a measure space, which I imagine would be harder, basically for cofinality reasons. And for this thing, where I guess we’d have something like surreal probabilities… well, I guess the cofinality issue gets easier—or maybe gets easier, I don’t want to say that it does—but it raises so many others. Like, if you can do that, you should at least be able to do surreal functions of a single surreal variable, right? But at the moment, as I said, nobody knows how (I’m pretty sure).

In short, while you say that the surreals solve a lot more problems than people realize, my point of view is basically the opposite: From the point of view of applications, the surreal numbers are basically an attractive nuisance. People are drawn to them for obvious reasons—surreals are cool! Surreals are fun! They include, informally speaking, all the infinities and infinitesimals! But they can be a huge pain to work with, and—much more importantly—whatever it is you need them to do, they probably don’t do it. “Includes all the infinities and infinitesimals” is probably not actually on your list of requirements; while if you’re trying to do any sort of decision theory, some sort of theory of integration is.

You have basically no idea how many times I’ve had to write the same “no, you really don’t want to use surreal utilities” comment here on LW. In fact years ago—basically due to constant abuse of surreals (or cardinals, if people really didn’t know what they were talking about)—I wrote this article here on LW, and (while it’s not like people are likely to happen across that anyway) I wish I’d included more of a warning against using the surreals.

Basically, I would say, go where the math tells you to; build your system to the requirements, don’t just go pulling something off the shelf unless it meets those requirements. And note that what you build might not be a system of numbers at all. I think people are often too quick to jump to the use of numbers in the first place. Real numbers get a lot of this, because people are familiar with them. I suspect that’s the real historical reason why utility functions were initially defined as real-valued; we’re lucky that they turned out to actually be appropriate!

(Added later: There is one other thing you can do in the surreals that kind of resembles a limit, and this is to take a limit of sign sequences. This at least doesn’t have the cofinality problem; you can take a sign-sequence limit of a sequence. But this is not any sort of drop-in replacement for usual limits either, and my impression (not an expert here) is that it doesn’t really work very well at all in the first place. My impression is that, while {left|right} can be a bit too oblivious to the details of the inputs (if you’re not careful), limits of sign sequences are a bit too finicky. For instance, defining e to be the sign-sequence limit of the partial sums 1, 2, 5/2, 8/3, 65/24, … will work, but defining exp(x) analogously won’t, because what if x is (as a real number) the logarithm of a dyadic rational? Instead of getting exp(log(2))=2, you’ll get exp(log(2))=2-1/ω. (I’m pretty sure that’s right.) There goes multiplicativity! Worse yet, exp(-log(2)) won’t “converge” at all. Again, I can’t rule out that, like {left|right}, it can be made to work with some care, but it’s definitely not a drop-in replacement, and my non-expert impression is that it’s overall worse than {left|right}. In any case, once again, the better choice is almost certainly not to use surreals.)

• That’s an exceptionally informative comment!

Do you know where I could find proofs of the following?

“Normally we define exp(x) to be the limit of 1, 1+x, 1+x+x^2/2, it’ll never get within 1/ω of e.”

“If you make the novice mistake in fixing it of instead trying to define exp(x) as {1,1+x,1+x+x^2/2,...|}, you will get not exp(1)=e but rather exp(1)=3.”

I still need to read more about surreal numbers, but the thing I like about them is that you can always reduce the resolution if you can’t solve the equation in the surreals. In some ways I view them as the ultimate reality, and if we don’t know the answer to something, or only know the answer to a certain fineness, I think it’s better to be honest about it, rather than just fall back to an equivalence class over the surreals where we do know the answer. Actually, maybe that wasn’t quite clear: I’m fine with falling back, but only after it’s clear that we can’t solve it to the finest degree.

• (Note: I’ve edited some things in to be clearer on some points.)

Do you know where I could find proofs of the following?

“Normally we define exp(x) to be the limit of 1, 1+x, 1+x+x^2/2, it’ll never get within 1/ω of e.”

“If you make the novice mistake in fixing it of instead trying to define exp(x) as {1,1+x,1+x+x^2/2,...|}, you will get not exp(1)=e but rather exp(1)=3.”

These are both pretty straightforward. For the first, say we’re working in a non-Archimedean ordered field which contains the reals, and we take the partial sums of the series 1+1+1/2+1/6+...; these are rational numbers, in particular they’re real numbers. So if we have one of these partial sums, call it s, then e-s is a positive real number. So if you have some infinitesimal ε, it’s larger than ε; that’s what an infinitesimal is. The sequence will not get within ε of e.
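Spelled out as a short derivation (nothing here beyond the argument above):

```latex
% s_n = \sum_{k=0}^{n} 1/k! is rational, hence real, and
e - s_n \;=\; \sum_{k=n+1}^{\infty} \frac{1}{k!} \;\ge\; \frac{1}{(n+1)!} \;>\; 0,
% so e - s_n is a positive real number. By definition an infinitesimal
% \varepsilon satisfies \varepsilon < r for every positive real r, hence
e - s_n \;>\; \varepsilon \quad\text{for every infinitesimal } \varepsilon > 0.
```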

For the second, note that 3={2|}, i.e., it’s the simplest number larger than 2. So if you have {1,2,5/2,8/3,...|}, well, the simplest number larger than all of those is still 3, because you did nothing to exclude 3. 3 is a very simple number! By definition, if you want to not get 3, either your interval has to not contain 3, or it has to contain something even simpler than 3 (i.e., 2, 1, or 0). (This is easy to see if you use the sign-sequence representation—remember that x is simpler than y iff the sign sequence of x is a proper prefix of the sign sequence of y.) The interval of surreals greater than those partial sums does contain 3, and does not contain 2, 1, or 0. So you get 3. That’s all there is to it.

As for the rest of the comment… let me address this out of order, if you don’t mind:

In some ways I view them as the ultimate reality

See, this is exactly the sort of thinking I’m trying to head off. How is that relevant to anything? You need to use something that actually fulfills the requirements of the problem.

On top of that, this seems… well, I don’t know if you actually are making this error, but it seems rather reminiscent of the high school student’s error of imagining that there’s a single notion of “number”—where every notion of “number” they know fits in C, so “number” and “complex number” become identified. And this is false not just because you can go beyond C, but because there are systems of numbers that can’t be fit together with C at all. (How does Q_p fit into this? Answer: It doesn’t!)

(Actually, by that standard, shouldn’t the surcomplexes be the “ultimate reality”? :) )

(...I actually have some thoughts on that sort of thing, but since I’m trying to point out right now that that sort of thing is not what you should be thinking about when determining what sort of space to use, I won’t go into them. “Ultimate reality” is, in addition to not being correct, probably not on the list of requirements!)

Also, y’know, you don’t necessarily need something that could be considered “numbers” at all, as I keep emphasizing.

Anyway, as to the mathematical part of what you were saying...

I still need to read more about surreal numbers, but the thing I like about them is that you can always reduce the resolution if you can’t solve the equation in the surreals. In some ways I view them as the ultimate reality, and if we don’t know the answer to something, or only know the answer to a certain fineness, I think it’s better to be honest about it, rather than just fall back to an equivalence class over the surreals where we do know the answer. Actually, maybe that wasn’t quite clear: I’m fine with falling back, but only after it’s clear that we can’t solve it to the finest degree.

I have no idea what you’re talking about here. Like, what? First off, what sort of equations are you talking about? Algebraic ones? Over the surreals, I guess? The surreals are a real closed field, the surcomplexes are algebraically closed. That will suffice for algebraic equations. Maybe you mean some more general sort, I don’t know.

But most of this is just baffling. I have no idea what you’re talking about when you speak of passing to a quotient of the surreals to solve any equation. Where is that coming from? And like—what sort of quotient are we talking about here? “Quotient of the surreals” is already suspect because, well, it can’t be a ring-theoretic quotient, as fields don’t have nontrivial ideals at all. So I guess you mean purely an additive quotient? But that’s not going to mix very well with solving any equations that involve more than addition, now, is it? Meanwhile, what the surreals are known for is that any ordered field embeds in them, not something about quotients!

Anyway, if you want to solve algebraic equations, you want an algebraically closed field. If you want to solve algebraic equations to the greatest extent possible while still keeping things ordered, you want a real closed field. The surreals are a real closed field, but you certainly don’t need them just for solving equations. If you want to be able to do limits and calculus and such, you want something with a nice topology (just how nice probably depends on just what you want), but note that you don’t necessarily want a field at all! None of these things favor the surreals, and the fact that we almost certainly need integration here is a huge strike against them.

Btw, you know what’s great for solving equations in, even if they aren’t just algebraic equations? The real numbers. Because they’re connected, so you have the intermediate value theorem. And they’re the only ordered field that’s connected. Again, you might be able to emulate that sort of thing to some extent in the surreals for sufficiently nice functions (mere continuity won’t be enough) (certainly you can for polynomials, like I said they’re real closed, but I’m guessing you can probably get more than that), I’m not super-familiar with just what’s possible there, but it’ll take more work. In the reals it’s just: make some comparisons, they come out opposite one another, R is connected, boom, there’s a solution somewhere in between.

But mostly I’m just wondering where, like, any of this is coming from. It neither seems to make much sense nor to resemble anything I know.

(Edit: And, once again, it’s not at all clear that being able to solve equations is at all relevant! That just doesn’t seem to be something that’s required. Whereas integration is.)

• “So if we have one of these partial sums, call it s, then e-s is a positive real number. So if you have some infinitesimal ε, it’s larger than ε; that’s what an infinitesimal is”—Are you sure this chain of reasoning is correct? Consider 1/2x. For any finite number of terms it will be greater than ε, but as x approaches ω, it should approach 1/2ω. Why can’t the partial sum get within 1/ω of e?

“But it seems rather reminiscent of the high school student’s error of imagining that there’s a single notion of ‘number’”—okay, the term “ultimate reality” is a stretch. I can’t imagine all possible applications, so I can’t imagine all possible numbering systems. My point is that we don’t just want to use a separate numbering system for each individual problem. We want to be philosophically consistent, and so there should be broad classes of problems for which we can identify a single numbering system as sufficient. And there’s a huge set of problems (which I’m not going to even attempt to spec out here) for which surreals can be justified on a philosophical level, even if it is convenient to drop down to another number system for the actual calculations. Maybe an example will help: Newtonian physics and Special Relativity are embedded in General Relativity. General relativity provides consistency for physics, even though we use one of the first two for the majority of calculations.

“I have no idea what you’re talking about here. Like, what?”

You’re right that it won’t be a nice neat quotient group. But here’s an example. N_0 - N_0 can equal any integer where N_0 is a cardinal, or even +/- N_0, but in surreal numbers it works as follows. Suppose X and Y are countable infinities. Then X - Y has a unique value that we can sometimes identify. For example, if X represents the length of a sequence and Y is all the elements in the sequences except one, then X - Y = 1. We can perform the calculation in the surreals, or we can perform it in the cardinals and receive a broad range of possible answers. But for every possible answer in the cardinals, we can find pairs of surreal numbers that would provide that answer.

• (Hey, a note, you should probably learn to use the blockquote feature. I dunno where it is in the rich text editor if you’re using that, but if you’re using the Markdown editor you just precede the paragraph you’re quoting with a “>”. It will make your posts substantially more readable.)

Are you sure this chain of reasoning is correct?

Yes.

Consider 1/2x. For any finite number of terms it will be greater than ε, but as x approaches ω, it should approach 1/2ω.

What “terms”? What are you talking about? This isn’t a sequence or a sum; there are no “terms” here. Yes, even in the surreals, as x goes to ω, 1/(2x) will approach 1/(2ω), as you say; as I mentioned above, limits of functions of a surreal variable will indeed still work. But that has no relevance to the case under discussion.

(And, while it’s not necessary to see what’s going on here, it may be helpful to remember that if we interpret this as occurring in the surreals, then in the case of 1/2x as x→ω, your domain has proper-class cofinality, while in the case of this infinite sum, the domain has cofinality ω. So the former can work, and the latter cannot. Again, one doesn’t need this to see that—the partial sum can’t get within 1/ω of e even when the cofinality is countable—but it may be worth remembering.)

Why can’t the partial sum get within 1/ω of e?

Because the partial sum is always a rational number. A rational number—more generally, a real number—cannot be infinitesimally close to e without being e. (By contrast, for surreal x, 1/(2x) certainly does not need to be a real number, and so can get infinitesimally close to 1/(2ω) without being equal to it.)

You’re right that it won’t be a nice neat quotient group. But here’s an example. N_0 - N_0 can equal any integer where N_0 is a cardinal, or even +/- N_0, but in surreal numbers it works as follows. Suppose X and Y are countable infinities. Then X - Y has a unique value that we can sometimes identify. For example, if X represents the length of a sequence and Y is all the elements in the sequences except one, then X - Y = 1. We can perform the calculation in the surreals, or we can perform it in the cardinals and receive a broad range of possible answers. But for every possible answer in the cardinals, we can find pairs of surreal numbers that would provide that answer.

What??

OK. Look. I could spend my time attempting to pick this apart. But, let me be blunt, the point I am trying to get across here is that you are talking nonsense. This is babble. You are way out of your depth, dude. You don’t know what you are talking about. You need to go back and relearn this from the beginning. I don’t even know what mistake you’re making, because it’s not a common one I recognize.

Just in the hopes it might be somewhat helpful, I will quickly go over the things I can maybe address quickly:

N_0 - N_0 can equal any integer where N_0 is a cardinal, or even +/- N_0, but in surreal numbers it works as follows.

I have no idea what this sentence is talking about.

Suppose X and Y are countable infinities.

What’s an “infinity”? An ordinal? A cardinal? (There’s only one countably infinite cardinal...) A surreal, or something else entirely? You said “countable”, so it has to be something to which the notion of countability applies!

This mistake, at least, I think I can identify. Maybe you should, in fact, look over that “quick guide to the infinite” I wrote, because this is myth #0 I discussed there. There’s no such thing as a unified notion of “infinities”. There are different systems of numbers, some of them contain numbers/objects that are infinite (i.e., larger in magnitude than any whole number); there is not some greater unified system they are all a part of.

Then X - Y has a unique value that we can sometimes identify.

What is X - Y? I don’t even know what system of numbers you’re using, so I don’t know what this means.

If X and Y are surreals, then, sure, there’s quite definitely a unique surreal X-Y. This is true more generally if you’re thinking of X and Y as living in some sort of ordered field or ring.

If X and Y are cardinals, then X-Y may not be well-defined. Trivially so if Y>X (no possible values), but let’s ignore that case. Even ignoring that, if X and Y are infinite, X-Y may fail to be well-defined due to having multiple possible values.
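For instance (a standard example, added for illustration): with X = Y = ℵ_0,

```latex
\aleph_0 + n = \aleph_0 \quad\text{for every finite } n,
\qquad\text{and}\qquad
\aleph_0 + \aleph_0 = \aleph_0,
% so every natural number n, and \aleph_0 itself, is a "possible value"
% of X - Y when X = Y = \aleph_0: there is no unique Z with Y + Z = X.
```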

If X and Y are ordinals, we have to ask what sort of addition we’re using. If we’re using natural addition, then X-Y certainly has a unique value in the surreals, but it may or may not be an ordinal, so it’s not necessarily well-defined within the ordinals.

If we’re using ordinary addition, we have to distinguish between X-Y and -Y+X. (The latter just being a way of denoting “subtracting on the left”; it should not be interpreted as actually negating Y and adding to X.) -Y+X will have a unique value so long as Y≤X, but X-Y is a different story; even restricting to Y≤X, if X is infinite, then X-Y may have multiple possible values or none.
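Concretely (standard ordinal arithmetic, added for illustration), with ordinary, non-commutative ordinal addition:

```latex
% "Subtracting on the left" works: the unique Z with 1 + Z = \omega is
-1 + \omega = \omega, \quad\text{since } 1 + \omega = \omega.
% "Subtracting on the right" can fail outright: \omega - 1 has no value,
% since Z + 1 is always a successor ordinal while \omega is a limit.
% And it can have many values: Z + \omega = \omega \cdot 2 holds for
% every Z = \omega + n with n finite.
```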

For example, if X represents the length of a sequence and Y is all the elements in the sequences except one, then X - Y = 1.

Yeah, not going to try to pick this apart; in short, though, this is nonsense.

I’m starting to think, though, that maybe you meant that X and Y were infinite sets, rather than some sort of numbers? With X-Y being the set difference? But that is not what you said. Simply put, you seem very confused about all this.

We can perform the calculation in the surreals, or we can perform it in the cardinals and receive a broad range of possible answers.

1. Are X and Y surreals or are they cardinals? Surreals and cardinals don’t mix, dude! It can’t be both, not unless they’re just whole numbers! You are performing the calculation in whatever number system these things live in.

2. You just said above you get a well-defined answer, and, moreover, that it’s 1! Now you’re telling me that you can get a broad range of possible answers??

3. If X is representing the length of a sequence, it should probably be an ordinal. As for Y… yeah, OK, not going to try to make sense of the thing I already said I wouldn’t attempt to pick through.

4. And if X and Y are sets rather than numbers… oh, to hell with it, I’m just going to move on.

But for every possible answer in the cardinals, we can find pairs of surreal numbers that would provide that answer.

There is, I think, a correct idea here that is rescuable. It also seems pretty clear you don’t know enough to perform that rescue yourself and rephrase this as something that makes sense. (A hint, though: The fixed version probably should not involve surreals.)

(Do surreal numbers even have cardinalities, in a meaningful sense? Yes, obviously, if you pick a particular way of representing surreals as sets, e.g. by representing them as sign sequences, the resulting representations will have cardinalities; obviously, that’s not what I’m talking about. Although, who knows, maybe that’s a workable notion—define the cardinality of a surreal to be the cardinality of its birthday. No idea if that’s actually relevant to anything, though.)

Even charitably interpreted, none of this matches up with your comments above about equivalence classes. It relates, sure, but it doesn’t match. What you said above was that you could solve more equations by passing to equivalence classes. What you’re saying now seems to be… not that.

Long story short: I really, really, do not think you have much idea what you are talking about. You really need to relearn this from scratch, and not starting with surreals. I definitely do not think you are prepared to go instructing others on their uses; at this point I’m not convinced you could clearly articulate what ordinals and cardinals are for, you’ve gotten everything so mixed up in your comment above. I wouldn’t recommend trying to expand this into a post.

I think I should probably stop arguing here. If you reply to this with more babble I’m not going to waste my time replying to it further.

• I really appreciate the time you’ve put into writing these long responses, and I’ll admit that there are some gaps in my understanding, but I don’t think you’ve understood what I was saying at all. I suspect that this is a hazard with producing informal overviews of ideas + illusion of transparency. One example: when I said equivalence classes, I really meant something like equivalence classes. Anyway, in regards to all the points you’ve raised, it’d take a lot of space to respond to them all, so I think I’ll just add a link to my post when I get time to write it.