Suppose that human beings had absolutely no idea how they performed arithmetic. Imagine that human beings had evolved, rather than having learned, the ability to count sheep and add sheep. People using this built-in ability have no idea how it works, the way Aristotle had no idea how his visual cortex supported his ability to see things. Peano Arithmetic as we know it has not been invented. There are philosophers working to formalize numerical intuitions, but they employ notations such as

Plus-Of(Seven, Six) = Thirteen

to formalize the intuitively obvious fact that when you add “seven” plus “six”, of course you get “thirteen”.

In this world, pocket calculators work by storing a giant lookup table of arithmetical facts, entered manually by a team of expert Artificial Arithmeticians, for starting values that range between zero and one hundred. While these calculators may be helpful in a pragmatic sense, many philosophers argue that they’re only simulating addition, rather than really adding. No machine can really count—that’s why humans have to count thirteen sheep before typing “thirteen” into the calculator. Calculators can recite back stored facts, but they can never know what the statements mean—if you type in “two hundred plus two hundred” the calculator says “Error: Outrange”, when it’s intuitively obvious, if you know what the words mean, that the answer is “four hundred”.
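The parable’s calculator can be sketched in a few lines of Python. This is purely illustrative (the function names are made up); the point is that everything the device “knows” was entered from outside, and anything beyond the table produces the parable’s “Error: Outrange”:

```python
def build_lookup_table(limit=100):
    """Facts entered manually by the expert Artificial Arithmeticians,
    for starting values between zero and one hundred. The humans doing
    the entering counted sheep to get each answer; the table just stores it."""
    return {(a, b): a + b for a in range(limit + 1) for b in range(limit + 1)}

def lookup_calculator(table, a, b):
    """Recite back a stored fact, or fail outside the table's range."""
    try:
        return table[(a, b)]
    except KeyError:
        return "Error: Outrange"

table = build_lookup_table()
```

Typing in “seven plus six” plays back the stored fact; typing in “two hundred plus two hundred” falls off the table, exactly as in the parable.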

Philosophers, of course, are not so naive as to be taken in by these intuitions. Numbers are really a purely formal system—the label “thirty-seven” is meaningful, not because of any inherent property of the words themselves, but because the label refers to thirty-seven sheep in the external world. A number is given this referential property by its semantic network of relations to other numbers. That’s why, in computer programs, the LISP token for “thirty-seven” doesn’t need any internal structure—it’s only meaningful because of reference and relation, not some computational property of “thirty-seven” itself.

No one has ever developed an Artificial General Arithmetician, though of course there are plenty of domain-specific, narrow Artificial Arithmeticians that work on numbers between “twenty” and “thirty”, and so on. And if you look at how slow progress has been on numbers in the range of “two hundred”, then it becomes clear that we’re not going to get Artificial General Arithmetic any time soon. The best experts in the field estimate it will be at least a hundred years before calculators can add as well as a human twelve-year-old.

But not everyone agrees with this estimate, or with merely conventional beliefs about Artificial Arithmetic. It’s common to hear statements such as the following:

• “It’s a framing problem—what ‘twenty-one plus’ equals depends on whether it’s ‘plus three’ or ‘plus four’. If we can just get enough arithmetical facts stored to cover the common-sense truths that everyone knows, we’ll start to see real addition in the network.”

• “But you’ll never be able to program in that many arithmetical facts by hiring experts to enter them manually. What we need is an Artificial Arithmetician that can learn the vast network of relations between numbers that humans acquire during their childhood by observing sets of apples.”

• “No, what we really need is an Artificial Arithmetician that can understand natural language, so that instead of having to be explicitly told that twenty-one plus sixteen equals thirty-seven, it can get the knowledge by exploring the Web.”

• “Frankly, it seems to me that you’re just trying to convince yourselves that you can solve the problem. None of you really know what arithmetic is, so you’re floundering around with these generic sorts of arguments. ‘We need an AA that can learn X’, ‘We need an AA that can extract X from the Internet’. I mean, it sounds good, it sounds like you’re making progress, and it’s even good for public relations, because everyone thinks they understand the proposed solution—but it doesn’t really get you any closer to general addition, as opposed to domain-specific addition. Probably we will never know the fundamental nature of arithmetic. The problem is just too hard for humans to solve.”

• “That’s why we need to develop a general arithmetician the same way Nature did—evolution.”

• “Top-down approaches have clearly failed to produce arithmetic. We need a bottom-up approach, some way to make arithmetic emerge. We have to acknowledge the basic unpredictability of complex systems.”

• “You’re all wrong. Past efforts to create machine arithmetic were futile from the start, because they just didn’t have enough computing power. If you look at how many trillions of synapses there are in the human brain, it’s clear that calculators don’t have lookup tables anywhere near that large. We need calculators as powerful as a human brain. According to Moore’s Law, this will occur in the year 2031 on April 27 between 4:00 and 4:30 in the morning.”

• “I believe that machine arithmetic will be developed when researchers scan each neuron of a complete human brain into a computer, so that we can simulate the biological circuitry that performs addition in humans.”

• “I don’t think we have to wait to scan a whole brain. Neural networks are just like the human brain, and you can train them to do things without knowing how they do them. We’ll create programs that will do arithmetic without us, their creators, ever understanding how they do arithmetic.”

• “But Gödel’s Theorem shows that no formal system can ever capture the basic properties of arithmetic. Classical physics is formalizable, so to add two and two, the brain must take advantage of quantum physics.”

• “Hey, if human arithmetic were simple enough that we could reproduce it in a computer, we wouldn’t be able to count high enough to build computers.”

• “Haven’t you heard of John Searle’s Chinese Calculator Experiment? Even if you did have a huge set of rules that would let you add ‘twenty-one’ and ‘sixteen’, just imagine translating all the words into Chinese, and you can see that there’s no genuine addition going on. There are no real numbers anywhere in the system, just labels that humans use for numbers...”

There is more than one moral to this parable, and I have told it with different morals in different contexts. It illustrates the idea of levels of organization, for example—a CPU can add two large numbers because the numbers aren’t black-box opaque objects, they’re ordered structures of 32 bits.
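To make that level of organization concrete, here is a sketch of how an adder circuit treats a number as an ordered structure of 32 bits. This is a textbook ripple-carry adder written in Python, illustrative rather than any particular CPU’s design:

```python
def add_32bit(x, y):
    """Add two numbers the way a hardware adder does: bit by bit,
    propagating a carry, because each number is an ordered structure
    of 32 bits rather than an opaque token."""
    result, carry = 0, 0
    for i in range(32):
        a = (x >> i) & 1                             # i-th bit of x
        b = (y >> i) & 1                             # i-th bit of y
        s = a ^ b ^ carry                            # sum bit
        carry = (a & b) | (a & carry) | (b & carry)  # carry out
        result |= s << i
    return result  # wraps around at 2**32, as real 32-bit hardware does
```

Because the bits are visible structure, the same short procedure handles every pair of inputs; no lookup table of “arithmetical facts” is needed.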

But for purposes of overcoming bias, let us draw two morals:

• First, the danger of believing assertions you can’t regenerate from your own knowledge.

• Second, the danger of trying to dance around basic confusions.

Lest anyone accuse me of generalizing from fictional evidence, both lessons may be drawn from the real history of Artificial Intelligence as well.

The first danger is the object-level problem that the AA devices ran into: they functioned as tape recorders playing back “knowledge” generated from outside the system, using a process they couldn’t capture internally. A human could tell the AA device that “twenty-one plus sixteen equals thirty-seven”, and the AA devices could record this sentence and play it back, or even pattern-match “twenty-one plus sixteen” to output “thirty-seven!”, but the AA devices couldn’t generate such knowledge for themselves.

Which is strongly reminiscent of believing a physicist who tells you “Light is waves”, recording the fascinating words and playing them back when someone asks “What is light made of?”, without being able to generate the knowledge for yourself. More on this theme tomorrow.

The second moral is the meta-level danger that consumed the Artificial Arithmetic researchers and opinionated bystanders—the danger of dancing around confusing gaps in your knowledge. The tendency to do just about anything except grit your teeth and buckle down and fill in the damn gap.

Whether you say, “It is emergent!”, or whether you say, “It is unknowable!”, in neither case are you acknowledging that there is a basic insight required which is possessable, but unpossessed by you.

How can you know when you’ll have a new basic insight? There’s no way to get one except by banging your head against the problem, learning everything you can about it, studying it from as many angles as possible, perhaps for years. It’s not a pursuit that academia is set up to permit, when you need to publish at least one paper per month. It’s certainly not something that venture capitalists will fund. You want to either go ahead and build the system now, or give up and do something else instead.

Look at the comments above: none are aimed at setting out on a quest for the missing insight which would make numbers no longer mysterious, make “twenty-seven” more than a black box. None of the commenters realized that their difficulties arose from ignorance or confusion in their own minds, rather than an inherent property of arithmetic. They were not trying to achieve a state where the confusing thing ceased to be confusing.

If you read Judea Pearl’s “Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference” then you will see that the basic insight behind graphical models is indispensable to problems that require it. (It’s not something that fits on a T-shirt, I’m afraid, so you’ll have to go and read the book yourself. I haven’t seen any online popularizations of Bayesian networks that adequately convey the reasons behind the principles, or the importance of the math being exactly the way it is, but Pearl’s book is wonderful.) There were once dozens of “non-monotonic logics” awkwardly trying to capture intuitions such as “If my burglar alarm goes off, there was probably a burglar, but if I then learn that there was a small earthquake near my home, there was probably not a burglar.” With the graphical-model insight in hand, you can give a mathematical explanation of exactly why first-order logic has the wrong properties for the job, and express the correct solution in a compact way that captures all the common-sense details in one elegant swoop. Until you have that insight, you’ll go on patching the logic here, patching it there, adding more and more hacks to force it into correspondence with everything that seems “obviously true”.
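The burglar-alarm intuition above falls out of a three-node Bayesian network by plain arithmetic. The sketch below uses made-up probabilities (not from Pearl’s book) and brute-force enumeration, just to show the “explaining away” effect: given the alarm, learning of the earthquake lowers the probability of a burglar.

```python
# Illustrative priors and alarm model; the numbers are invented for this sketch.
P_B = 0.01  # prior probability of a burglar
P_E = 0.02  # prior probability of an earthquake

def p_alarm(b, e):
    """P(alarm | burglar, earthquake): either cause can set off the alarm."""
    if b and e: return 0.95
    if b:       return 0.94
    if e:       return 0.29
    return 0.001

def posterior_burglar(alarm=True, earthquake=None):
    """P(burglar | evidence), by enumerating the joint over B and E."""
    num = den = 0.0
    for b in (True, False):
        for e in (True, False):
            if earthquake is not None and e != earthquake:
                continue  # condition on the observed earthquake value
            joint = (P_B if b else 1 - P_B) * (P_E if e else 1 - P_E)
            joint *= p_alarm(b, e) if alarm else 1 - p_alarm(b, e)
            den += joint
            if b:
                num += joint
    return num / den
```

With these numbers, `posterior_burglar()` is well above the prior, while `posterior_burglar(earthquake=True)` drops sharply: the earthquake explains the alarm away, which is exactly the behavior the non-monotonic logics were straining to capture.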

You won’t know the Artificial Arithmetic problem is unsolvable without its key. If you don’t know the rules, you don’t know the rule that says you need to know the rules to do anything. And so there will be all sorts of clever ideas that seem like they might work, like building an Artificial Arithmetician that can read natural language and download millions of arithmetical assertions from the Internet.

And yet somehow the clever ideas never work. Somehow it always turns out that you “couldn’t see any reason it wouldn’t work” because you were ignorant of the obstacles, not because no obstacles existed. Like shooting blindfolded at a distant target—you can fire blind shot after blind shot, crying, “You can’t prove to me that I won’t hit the center!” But until you take off the blindfold, you’re not even in the aiming game. When “no one can prove to you” that your precious idea isn’t right, it means you don’t have enough information to strike a small target in a vast answer space. Until you know your idea will work, it won’t.

From the history of previous key insights in Artificial Intelligence, and the grand messes which were proposed prior to those insights, I derive an important real-life lesson: When the basic problem is your ignorance, clever strategies for bypassing your ignorance lead to shooting yourself in the foot.

• Well, shooting randomly at a distant target is more likely to produce a bull’s-eye than not shooting at all, even though you’re almost certainly going to miss (and probably shoot yourself in the foot while you’re at it). It’s probably better to try to find a way to take off that blindfold. As you suggest, we don’t yet understand intelligence, so there’s no way we’re going to make an intelligent machine without either significantly improving our understanding or winning the proverbial lottery.

“Programming is the art of figuring out what you want so precisely that even a machine can do it.”—Some guy who isn’t famous

• Well, shooting randomly is perhaps a bad idea, but I think the best we can do is shoot systematically, which is hardly better (it takes exponentially many bullets). So you either have to be lucky, or hope the target isn’t very far, so you don’t need a wide cone to take pot shots at, or hope P=NP.

• @Doug & Gray: AGI is a William Tell target. A near miss could be very unfortunate. We can’t responsibly take a proper shot till we have an appropriate level of understanding and confidence of accuracy.

• A near miss could be very unfortunate.

This keeps on coming up; is there somewhere this is explained in detail? Also, have possible solutions been looked at, such as constructing the AI in a controlled environment? If so, why wouldn’t any of them work?

Thanks to whoever responds.

• Eliezer,

Did you include your own answer to the question of why AI hasn’t arrived yet in the list? :-)

This is a nice post. Another way of stating the moral might be: “If you want to understand something, you have to stare your confusion right in the face; don’t look away for a second.”

So, what is confusing about intelligence? That question is problematic: a better one might be “what isn’t confusing about intelligence?”

Here’s one thing I’ve pondered at some length. The VC theory states that in order to generalize well, a learning machine must implement some form of capacity control or regularization, which roughly means that the model class it uses must have limited complexity (VC dimension). This is just Occam’s razor.

But the brain has on the order of 10^12 synapses, and so it must be enormously complex. How can the brain generalize, if it has so many parameters? Are the vast majority of synaptic weights actually not learned, but rather preset somehow? Or is regularization implemented in some other way, perhaps by applying random changes to the values of the weights (this would seem biochemically plausible)?

Also, the brain has a very high metabolic cost, so all those neurons must be doing something valuable.
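The capacity-control point can be seen in a toy experiment (everything below is invented for illustration): an unlimited-capacity learner that memorizes its training set fits it perfectly yet generalizes at chance, while a one-parameter threshold rule generalizes well on the same noisy task.

```python
import random

random.seed(0)

def make_data(n):
    """Label is 1 iff x > 0.5, with 10% label noise."""
    data = []
    for _ in range(n):
        x = random.random()
        y = int(x > 0.5)
        if random.random() < 0.1:
            y = 1 - y
        data.append((x, y))
    return data

train, test = make_data(200), make_data(1000)

# High-capacity learner: memorize every training point (unbounded VC dimension).
memory = {x: y for x, y in train}
def memorizer(x):
    return memory.get(x, 0)  # unseen inputs fall back to an arbitrary default

# Low-capacity learner: a single threshold picked to minimize training errors.
best_t = min((x for x, _ in train),
             key=lambda t: sum(int(x > t) != y for x, y in train))
def thresholder(x):
    return int(x > best_t)

def accuracy(f, data):
    return sum(f(x) == y for x, y in data) / len(data)
```

The memorizer scores 100% on the training data and roughly 50% on fresh data; the threshold rule scores about 90% on both, because its limited capacity forces it to fit the regularity rather than the noise.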

• “Are the vast majority of synaptic weights actually not learned, but rather preset somehow?”

This is what some philosophers have proposed; others have thought we start as a blank slate. The research into the subject has shown that babies do start with some sort of working model of things. That is, we begin life with a set of preset preferences, the ability to distinguish those preferences, and a basic understanding of geometric shapes.

• It would be shocking if we didn’t have preset functions. Calves, for example, can walk almost straight away and swim not much later. We aren’t going to have entirely eliminated the mammalian ability to start with a set of preset features; there just isn’t enough pressure not to keep a few of them.

• If you put a newborn whose mother had an unmedicated labor on the mother’s stomach, the baby will move up to a breast and start to feed.

• Conversely, studies with newborn mammals have shown that if you deprive them of something as simple as horizontal lines, they will grow up unable to distinguish lines that approach ‘horizontalness’. So even separating the most basic evolved behavior from the most basic learned behavior is not intuitive.

• The deprivation you’re talking about takes place over the course of days and weeks—it reflects the effects of (lack of) reinforcement learning, so it’s not really germane to a discussion of preset functions that manifest in the first few minutes after birth.

• It’s relevant insofar as we shouldn’t make assumptions about what is and is not preset simply based on observations that take place in a “typical” environment.

• Ah, a negative example. Fair point. Guess I wasn’t paying enough attention and missed the signal you meant to send by using “conversely” as the first word of your comment.

• That was lazy of me, in retrospect. I find that often I’m poorer at communicating my intent than I assume I am.

• Good point. Drink (food), breathe, scream, and a couple of cute reactions to keep caretakers interested. All you need to bootstrap a human growth process. There seems to be something built in about eye-contact management too, because a lack of it is an early indicator that something is wrong.

• a couple of cute reactions to keep caretakers interested

Not terribly relevant to your point, but it’s likely the human sense of cuteness is based on what babies do rather than the other way around.

• I’d replace “human” with “mammalian”—most young mammals share a similar set of traits, even those that aren’t constrained as we are by big brains and a pelvic girdle adapted to walking upright. That seems to suggest a more basal cuteness response; I believe the biology term is “baby schema”.

Other than that, yeah.

• Artificial Neural Networks have been trained with millions of parameters. There are a lot of different methods of regularization, like dropconnect or sparsity constraints. But the brain does online learning. Overfitting isn’t as big of a concern because it doesn’t see the data more than once.

• On the other hand, architecture matters. The most successful neural network for a given task has connections designed for the structure of that task, so that it will learn much more quickly than a fully-connected or arbitrarily connected network.

The human brain appears to have a great deal of information and structure in its architecture right off the bat.

• I’m not saying that you’re wrong, but the state of the art in computer vision is weight sharing, which biological NNs probably can’t do. Hyperparameters, like the number of layers and how local the connections should be, are important, but they don’t give that much prior information about the task.

I may be completely wrong, but I do suspect that biological NNs are far more general-purpose and less “pre-programmed” than is usually thought. The learning rules for a neural network are far simpler than the functions they learn. Training neural networks with genetic algorithms is extremely slow.

• The architecture of the V1 and V2 areas of the brain, which Convolutional Neural Networks and other ANNs for vision borrow heavily from, is highly geared towards vision, and includes basic filters that detect stripes, dots, corners, etc. that appear in all sorts of computer vision work. Yes, neither backpropagation nor weight-sharing is directly responsible for this, but the presence of local filters is still what I would call very specific architecture (I’ve studied computer vision and the inspiration it draws from early vision specifically, so I can say more about this).

The way genetic algorithms tune weights in an ANN (and yes, this is an awful way to train an ANN) is very different from the way they work in actually evolving a brain, which means working on the genetic code that develops the brain. I’d say they are so wildly different that no conclusions from the first can be applied to the second.

During a single individual’s life, Hebbian and other learning mechanisms in the brain are distinct from gradient learning, but can achieve somewhat similar things.

• The human brain appears to engage in hierarchical learning, which is what allows it to leverage huge amounts of “general case” abstract knowledge in attacking novel specific problems put before it.

• That’s not how William Tell managed it. He had to practice aiming at less-dangerous targets until he became an expert, and only then did he attempt to shoot the apple.

It is not clear to me that it is desirable to prejudge what an artificial intelligence should desire or conclude, or even possible to purposefully put real constraints on it in the first place. We should simply create the god, then acknowledge the truth: that we aren’t capable of evaluating the thinking of gods.

• It is not clear to me that it is desirable to prejudge what an artificial intelligence should desire or conclude,

But it shouldn’t conclude that throwing large asteroids at Yellowstone is a good idea, nor desire to do it. If you follow this strategy, you’ll doom us. Simple as that.

• Adding to DanBurFoot: is there a link you want to point to that shows your real, tangible results for AI, based on your superior methodology?

• I think that one of the difficulties inherent in non-monotonic logics comes from the fact that real numbers are not very good at representing continuous quantities. In order to define a single point, an infinite number of digits is needed, and thus an infinite amount of information. Often mathematicians ignore this. To them, using the symbol 2 to represent a continuous quantity is the same as the symbol 2.000…, which seems to make for all kinds of weird paradoxes caused by the use of, often implied, infinite digits. For example, logicians seem to be unable to make a distinction between 1.999… and 2 (where they take two as meaning 2.000...), thus two different definable real numbers represent the same point.

When using real numbers that represent a continuous value, I often wonder if we shouldn’t always be using the number of digits to represent some kind of uncertainty. Using significant digits is one of the first things students learn in university; they are crucial for experiments in the real world, and they allow us to quantify the uncertainty in the digits we write down. Yet mathematicians and logicians seem to ignore them in favor of paradoxical infinities. I wonder if, by using uncertainty in this way, we might not do away with Gödel’s theorem and define arithmetic within a certain amount of relative uncertainty inherent to our measuring instruments and reasoning machinery.

• For what it’s worth, Benoit Essiambre, the things you have just said are nonsense. The reason logicians seem to be unable to make a distinction between 1.999… and 2 is that there is no distinction. They are not two different definable real numbers, they are the same definable real number.

• Except that 1.9999… < 2

Edit: here’s the proof that I’m wrong mathematically (from the provided Wikipedia link): “Multiplication of 9 times 1 produces 9 in each digit, so 9 × 0.111… equals 0.999… and 9 × 1⁄9 equals 1, so 0.999… = 1”

• The easiest example I’ve come across is:

If (1 ÷ 3 = 0.333...) and (0.999… ÷ 3 = 0.333...) then (1 = 0.999...).
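As a sanity check, the argument above can be verified with exact rational arithmetic; this small illustrative sketch (using Python’s fractions module) shows that every finite truncation 0.999…9 falls short of 1 by exactly 10^-n, so the shortfall vanishes in the limit:

```python
from fractions import Fraction

def point_nines(n):
    """0.999...9 with n nines, as an exact fraction: the sum of 9/10^k."""
    return sum(Fraction(9, 10**k) for k in range(1, n + 1))

# Exact check of the quoted step: 3 * (1/3) == 1 with no rounding anywhere.
assert Fraction(1, 3) * 3 == 1

# Every finite truncation falls short of 1 by exactly 10^-n...
assert 1 - point_nines(5) == Fraction(1, 10**5)
# ...so as n grows the gap goes to zero, and the limit of 0.999... is 1.
```

The same bookkeeping applies to 1.999… and 2: the gap after n nines is exactly 10^-n, which is why the infinite decimal denotes the limit.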

• Ok. Interesting.

I can see and agree that 1.999… can in the limit equal 2, whereas any finite representation would still be less than 2.

I don’t consider them to be “the same number” in that sense… even though they algebraically equate (once the limit is reached) in a theoretical framework that can encompass infinities.

ie, in maths, I’d equate them, but in the “real world” I’d treat them separately.

Edit: and reading further… it seems I’m wrong again. Of course, the whole point of putting “...” is to represent the fact that this is the limit of the decimal expansion taken to infinity.

therefore yep, 1.999… = 2

Where my understanding failed me is that 1.999… does not in fact represent the summation of the infinite set of 1 + 0.9 + 0.09 + …, which summation could, in fact, simply not be taken to its full limit. The representation “1.999...” can only represent either the set or the limit of the set, and mathematical convention has it as the latter, not the former.

• Another argument that may be more convincing on a gut level:

9×(1⁄9) is exactly equal to 1, correct?

Find the decimal representation of 1⁄9 using long division: 1⁄9 = 0.11111111… (note there is no different or superior way to represent this number as a decimal)

9×(1⁄9) = 9×(0.11111111...) = 0.9999999…, which we already agreed was exactly equal to 1.
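The long-division step here can itself be made mechanical. This illustrative sketch generates decimal digits the way grade-school long division does, confirming that 1⁄9 yields nothing but 1s:

```python
def decimal_digits(numerator, denominator, n):
    """First n digits after the decimal point of numerator/denominator
    (for 0 < numerator < denominator), by ordinary long division."""
    digits = []
    remainder = numerator
    for _ in range(n):
        remainder *= 10
        digits.append(remainder // denominator)  # next digit of the quotient
        remainder %= denominator                 # carry the remainder forward
    return digits
```

Every digit of 1⁄9 comes out as 1, and multiplying each digit by 9 gives the string of 9s in 0.999…, which is the gut-level argument in code.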

• Yes :)

See my previous (edited) comment above.

• No problem. It is a great proof (there aren’t many so simple and succinct). Just bad luck on timing ;)

• Note also that it has to denote the limit, because we want it to denote a number, and the other object you describe (a sequence rather than a set, strictly speaking) isn’t a number, just, well, a sequence of numbers.

• it has to denote the limit, because we want it to denote a number,

This is the part I take issue with.

It does not have to denote a number, but we choose to let it denote a number (rather than a sequence) because that is how mathematicians find it most convenient to use that particular representation.

That sequence is also quite useful mathematically—just not as useful as the number-that-represents-the-limit. Many sequences are considered to be useful… though generally not in algebra—it’s more common in calculus, where such sequences are extremely useful. In fact I’d say that in calculus “just a sequence” is perhaps even more useful than “just a number”.

My first impression (and thus what I originally got wrong) was that 1.999… represented the sequence and not the limit because, really, if you meant 2, why not just say 2? :)

• If we wanted to talk about the sequence we would never denote it 1.999… We would write {1, 1.9, 1.99, 1.999, …} and perhaps give the formula for the Nth term, which is 2 − 10^-N.

• Hi Misha, I might also turn that argument back on you and repeat what I said before: “if you meant 2, why not just say 2?” It’s as valid as “if you meant the sequence, why not just write {1, 1.9, 1.99, 1.999, …}”?

Clearly there are other reasons for using something that is not the usual convention. There are definitely good reasons for representing infinite series or sequences… as you have pointed out. However—there is no particular reason why mathematics has chosen to use 1.999… to mean the limit, as opposed to the actual infinite series. Either one could be equally validly used in this situation.

It is only by common convention that mathematics uses it to represent the actual limit (as n tends to infinity), instead of the other possibility—which would be “the actual limit as n tends to infinity… if we actually take it to infinity, or an infinitesimal less than the limit if we don’t”—which is how I assumed (incorrectly) that it was to be used.

However, the other thing you say, that “we never denote it 1.999...”, pulls out an interesting thought, and if I grasp what you’re saying correctly, then I disagree with you.

As I’ve mentioned in another comment now—mathematical symbolic conventions are the same as “words”—they are map, not territory. We define them to mean what we want them to mean. We choose what they mean by common consensus (motivated by convenience). It is a very good idea to follow that convention—which is why I decided I was wrong to use it the way I originally assumed it was being used… and from now on, I will use the usual convention...

However, you seem to be saying that you think the current way is “the one true way” and that the other way is not valid at all… ie that “we would never denote it 1.9999...” is some sort of fact out there in reality, when really it’s just a convention that we’ve chosen, and is therefore non-obvious from looking at the symbol without prior knowledge of the convention (as I did).

I am trying to explain that this is not the case—without knowing the convention, either meaning is valid… it’s only having now been shown the convention that I know what is generally “by definition” meant by the symbol, and it happened to be a different way from what I automatically picked without prior knowledge.

so yes, I think we would never denote the sequence as 1.999…, but not because the sequence is not representable by 1.999…; simply because it is conventional not to do so.

• You have a point. I tend to dislike arguments about mathematics that start with “well, this definition is just a choice” because they don’t capture any substance about any actual math. As a result, I tried to head that off by (perhaps poorly) making a case for why this definition is a reasonable choice.

In any case, I misunderstood the nature of what you were saying about the convention, so I don’t think we’re in any actual disagreement.

I might also turn that argument back on you and repeat what I said before: “if you meant 2, why not just say 2?”

If I meant 2, I would say 2. However, our system of writing repeating decimals also allows us to (redundantly) write the repeating decimal 1.999…, which is equivalent to 2. It’s not a very useful repeating decimal, but it sometimes comes out as a result of an algorithm: e.g. when you multiply 2⁄9 = 0.222… by 9, you will get 1.999… as you calculate it, instead of getting 2 straight off the bat.

• You have a point. I tend to dislike arguments about mathematics that start with “well, this definition is just a choice”

Me too! Especially as I’ve just been reading that sequence here about “proving by definition” and “I can define it any way I like”… that’s why I tried to make it very clear I wasn’t saying that… I also needed to head off the heading-off ;)

Anyway—I believe we are just in violent agreement here, so no problems ;)

• OK, let me put it this way: If we are considering the question “Is 1.999...=2?”, the context makes it clear that we must be considering the left-hand side as a number, because the RHS is a number. (Would you interpret 2 in that context as the constant-2 sequence? Well then of course they’re not equal, but this is obvious and unenlightening.) Why would you compare a number for equality against a sequence? They’re entirely different sorts of objects.

• is “x-squared = 2” ? is a perfectly valid ques­tion to ask in math­e­mat­ics even though the LHS is not ob­vi­ously an number

In this case, it is a for­mula that can equate to a num­ber… just as the se­quence is a (very limited) for­mula that can equate to 2 - if we take the se­quence to its limit; or that falls just shy of 2 - if we try and rep­re­sent it in any finite/​limited way.

In stat­ing that 1.9999… is a num­ber, you are as­sum­ing the us­age of the limit/​num­ber, rather than the other po­ten­tial us­age ie, you are fal­ling into the same as­sump­tion-trap that I fell into… It’s just that your as­sump­tion hap­pens to be the one that matches with com­mon us­age, whereas mine wasn’t ;)

Using 1.9999… to represent the limit of the sequence (ie the number) is certainly true by convention (ie “by definition”), but is by no means the only way to interpret the symbols. It could just as easily represent the sequence itself… we just don’t happen to do that—we define what mathematical symbols refer to… they’re just the words/pointers to what we’re talking about, yes?

• “x-squared = 2?” is a perfectly valid question to ask in mathematics even though the LHS is not obviously a number

Er… yes it is? In that con­text, x^2 is a num­ber. We just don’t know what num­ber it might be. By con­trast, the se­quence (1, 1.9, 1.99, …) is not a num­ber at all.

Fur­ther­more, even if we in­sist on re­gard­ing x^2 as a for­mula with a free vari­able, your anal­ogy doesn’t hold. The se­quence (1, 1.9, 1.99, …) has no free vari­ables; it’s one spe­cific se­quence.

You are cor­rect that the con­ven­tion could have been that 1.999… rep­re­sents the se­quence… but as I stated be­fore, in that case, the ques­tion of whether it equals 2 would not be very mean­ingful. Given the con­text you can de­duce that we are us­ing the con­ven­tion that it des­ig­nates a num­ber.

• By con­trast, the se­quence (1, 1.9, 1.99, …) is not a num­ber at all

yes I agree, a se­quence is not a num­ber, it’s se­quence… though I won­der if we’re get­ting con­fused, be­cause we’re talk­ing about the se­quence, in­stead of the in­finite se­ries (1 + 0.9 + 0.09 +...) which is ac­tu­ally what I had in my head when I was first think­ing about 1.999...

Along the way, some­body said “se­quence” and that’s the word I started us­ing… when re­ally I’ve been think­ing about the in­finite se­ries.… anyway

The in­finite se­ries has far less free­dom than x^2, but that doesn’t mean that it’s a differ­ent thing en­tirely from x^2.

Let’s consider “x − 1”

“x − 1” is not a number, until we equate it to something that lets us determine what x is…

If we use “x − 1 = 4”, however, we can solve for x and there are no degrees of freedom.

If we use “1.9 < x − 1 < 2” we have some minor degree of freedom… and only just a few more than the infinite series in question.

Admittedly, the only degree of freedom left to 1.9999… (the series) is to either be 2 or an infinitesimal away from 2. But I don’t think that makes it different in kind to x − 1 = 4

any­way—I think we’re prob­a­bly just in “vi­o­lent agree­ment” (as a friend of mine once used to say) ;)

All the bits that I was try­ing to re­ally say we agree over… now we’re just dis­cussing the re­lated maths ;)

the ques­tion of whether it equals 2 would not be very meaningful

Ok, let’s move into hypothetical land and pretend that 1.9999… represents what I originally thought it represents.

The com­par­i­son with the num­ber 2 pro­vides the mean­ing that what you want to do is to eval­u­ate the se­ries at its limit.

It’s to­tally sup­port­able for you to equate 1.9999… = 2 and de­ter­mine that this is a state­ment that is: 1) true when the in­finite se­ries has been eval­u­ated to the limit 2) false when it is rep­re­sented in any finite/​limited way

Edit: ah… that’s why you can’t use stars for to-the-power-of ;)

• any­way—I think we’re prob­a­bly just in “vi­o­lent agree­ment” (as a friend of mine once used to say) ;)

Er, no… there still seems to be quite a bit of con­fu­sion here...

All the bits that I was try­ing to re­ally say we agree over… now we’re just dis­cussing the re­lated maths ;)

Well, if you re­ally think that’s not sig­nifi­cant… :P

yes I agree, a se­quence is not a num­ber, it’s se­quence… though I won­der if we’re get­ting con­fused, be­cause we’re talk­ing about the se­quence, in­stead of the in­finite se­ries (1 + 0.9 + 0.09 +...) which is ac­tu­ally what I had in my head when I was first think­ing about 1.999...

Along the way, some­body said “se­quence” and that’s the word I started us­ing… when re­ally I’ve been think­ing about the in­finite se­ries.… anyway

It’s not clear to me what dis­tinc­tion you’re draw­ing here. A se­ries is a se­quence, just writ­ten differ­ently.

The in­finite se­ries has far less free­dom than x^2, but that doesn’t mean that it’s a differ­ent thing en­tirely from x^2.

It’s not at all clear to me what no­tion of “de­grees of free­dom” you’re us­ing here. The se­quence is an en­tirely differ­ent sort of thing than x^2, in that one is a se­quence, a com­plete math­e­mat­i­cal ob­ject, while the other is an ex­pres­sion with a free vari­able. If by “de­grees of free­dom” you mean some­thing like “free vari­ables”, then the se­quence has none. Now it’s true that, be­ing a se­quence of real num­bers, it is a func­tion from N to R, but there’s quite a differ­ence be­tween the ex­pres­sion 2-10^(-n), and the func­tion (i.e. se­quence) n |-> 2-10^(-n) ; yes, nor­mally we sim­ply write the lat­ter as the former when the mean­ing is un­der­stood, but un­der the hood they’re quite differ­ent. In a sense, func­tions are math­e­mat­i­cal, ex­pres­sions are meta­math­e­mat­i­cal.

When I say “x^2 is a number”, what I mean is essentially, if we’re working under a type system, then it has the type “real number”. It’s an expression with one free variable, but it has type “real number”. By contrast, the function x |-> x^2 has type “function from reals to reals”, the sequence (1, 1.9, 1.99, …) has type “sequence of reals”… (I realize that in standard mathematics we don’t actually technically work under a type system, but for practical purposes it’s a good way to think, and I’m pretty sure it’s possible to sensibly formulate things this way.) To equate a sequence to a number may technically in a sense return “false”, but it’s better to think of it as returning “type error”. By contrast, equating x^2 to 2 - not equating the function x|->x^2 to 2, which is a type error! - allows us to infer that x^2 is also a number.
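The number/function/sequence distinction above can be sketched in typed-programming terms (a toy illustration; the names `a` and `Seq` are my own choices):

```python
from fractions import Fraction
from typing import Callable

# The expression 2 - 10**(-n) only denotes a number once n is bound;
# the *function* n |-> 2 - 10**(-n) is a different sort of object entirely.
Seq = Callable[[int], Fraction]  # type "sequence of rationals": N -> Q

def a(n: int) -> Fraction:
    """The sequence (1, 1.9, 1.99, ...), i.e. a(n) = 2 - 10**(-n)."""
    return 2 - Fraction(1, 10 ** n)

term: Fraction = a(3)   # a number: exactly 1999/1000
seq: Seq = a            # a function/sequence: not a number at all

# Comparing the *term* to 2 is a meaningful (false) numeric statement;
# comparing the *function* to 2 is really a category mismatch ("type error").
print(term == 2)  # False: 1999/1000 != 2
print(seq == 2)   # False, but only because Python compares unlike types
```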

Ad­mit­tedly, the only de­gree of free­dom left to 1.9999… (the se­ries) is to ei­ther be 2 or an in­finites­i­mal away from 2. But I don’t think that makes it differ­ent in kind to x −1 = 4

Note, BTW, that the real num­bers don’t have any in­finites­i­mals (save for 0, if you count it).

It’s to­tally sup­port­able for you to equate 1.9999… = 2 and de­ter­mine that this is a state­ment that is: 1) true when the in­finite se­ries has been eval­u­ated to the limit 2) false when it is rep­re­sented in any finite/​limited way

Sorry, what does it even mean for it to be “rep­re­sented in a finite/​limited way”? The al­ter­na­tive to it be­ing a num­ber is it be­ing an in­finite se­quence, which is, well, in­finite.

I am re­ally get­ting the idea you should go read the stan­dard stuff on this and clear up any re­main­ing con­fu­sion that way, rather than try to ar­gue this here...

• Er, no… there still seems to be quite a bit of con­fu­sion here...

ah—then I apol­o­gise. I need to clar­ify. I see that there are sev­eral points where you’ve pointed out that I am us­ing math­e­mat­i­cal lan­guage in a sloppy fash­ion. How about I get those out of the way first.

that the real num­bers don’t have any infinitesimals

I should not have used the word “in­finites­i­mal”—as I re­ally meant “a very small num­ber” and was be­ing lazy. I am aware that “the the­ory of in­finites­i­mals” has an ac­tual math­e­mat­i­cal mean­ing… but this is not the way in which I was us­ing the word. I’ll ex­plain what I meant in a bit..

what does it even mean for it to be “rep­re­sented in a finite/​limited way”?

If I write a pro­gram that starts by adding 1 to 0.9 then I put it into a loop where it then adds “one tenth of the pre­vi­ous num­ber you just added”...

If at any point I tell the pro­gram “stop now and print out what you’ve got so far”… then what it will print out is some­thing that is “a very small num­ber” less than 2.

If I left the program running for literally an infinite amount of time, it would eventually reach two. If I stop at any point at all (ie the program is finite), then it will return a number that is a very small amount less than two.

In this way, the pro­gram has gen­er­ated a finite ap­prox­i­ma­tion of 1.999… that is != 2

As humans, we can think about the problem in a way that a stupid computer algorithm cannot, and can prove to ourselves that 1+(0.111… * 9) actually == 2 exactly. But that is knowledge outside of the proposed “finite” solution/system as described above.

Thus the two are differ­ent “rep­re­sen­ta­tions” of 1.999...

I am re­minded of the old en­g­ineer­ing adage that “3 is a good ap­prox­i­ma­tion of Pi for all prac­ti­cal pur­poses”—which tends to make some math­e­mat­i­ci­ans squirm.

It’s not at all clear to me what no­tion of “de­grees of free­dom” you’re us­ing here.

x^2 has one de­gree of free­dom. x can be any real number

1 < x < 1.1 has less free­dom than that. It can be any real num­ber be­tween 1 and 1.1

With the pre­vi­ous de­scrip­tion I’ve given of the differ­ence be­tween the re­sults of a “finite” and “in­finite” calcu­la­tion of the limit of 1.999… (the se­ries), “x = 1.999...” can be ei­ther 2 (if we can go to the limit or can think about it in a way out­side of the sum­ming-the-finite-se­ries method) or a very small num­ber less than two (if we be­gin calcu­lat­ing but have to stop calcu­lat­ing for some weird rea­son, such as run­ning out of time be­fore the heat-death of the uni­verse).

The “freedom” involved here is even more limited than the freedom of 1 < x < 1.1 and would not constitute a full “degree” of freedom in the mathematical sense. But in the way that I have already mentioned above (quite understanding that this may not be the full mathematically approved way of reasoning about it)… it can have more than one value (given the previously-stated contexts) and thus may be considered to have some “freedom”. …even if it’s only between “2” and “a very, very small distance from 2”

I’d like to think of it as a frac­tional de­gree of free­dom :)

I am re­ally get­ting the idea you should go read the stan­dard stuff on this and clear up any re­main­ing con­fu­sion that way, rather than try to ar­gue this here...

Firstly—there is no sur­prise that you are un­fa­mil­iar with my back­ground… as I haven’t speci­fi­cally shared it with you. But I hap­pen to have ac­tu­ally started in a maths de­gree. I had a dis­tinc­tion av­er­age, but didn’t en­joy it enough… so I switched to com­put­ing. I’m cer­tainly not a to­tal maths ex­pert (un­like my Dad and my maths-PhD cousin) but I would say that I’m fairly fa­mil­iar with “the stan­dard stuff”. Of course… as should be ob­vi­ous—this does not mean that er­rors do not still slip through (as I’ve re­cently just clearly learned).

Se­condly—with re­spect, I think that some of the con­fu­sion here is that you are con­fused as to what I’m talk­ing about… that is to­tally my fault for not be­ing clear—but it will not be cleared up by me go­ing away and re­search­ing any­thing… be­cause I think it’s more of a com­mu­ni­ca­tion is­sue than a knowl­edge-based one.

So… back to the point at hand.

I think I get what you’re try­ing to say with the type-er­ror ex­am­ple. But I don’t know that you quite get what I’m say­ing. That is prob­a­bly be­cause I’ve been say­ing it poorly…

I don’t know if you’ve pro­grammed in type­less pro­gram­ming lan­guages, but my origi­nal un­der­stand­ing is more along the lines of:

Lets say I have this ob­ject, and on the out­side it’s called “1.999...”

When I ask it “how do I calcu­late your value?” it can re­ply “well, you add 1 to 0.9 and then 0.09 and then 0.009...” and it keeps go­ing on and on… and if I write it down as it comes out… it looks just like the In­finite Series.

So then I ask it “what number do you equate to if I get to the end of all that addition?” and it says “2” - and that looks like the Limit

I could even ask it “do you equal two?” and it could realise that I’m asking it to calculate its limit and say “yes”

But then I ac­tu­ally try the ad­di­tion in the Series my­self… and I go on and on and on… and each next value looks like the next num­ber in the Sequence

but even­tu­ally I get bored and stop… and the num­ber I have is not quite 2… al­most, but not quite… which is the Finite Rep­re­sen­ta­tion that I keep talk­ing about.

Then you can see that this ob­ject matches all the prop­er­ties that I have men­tioned in my pre­vi­ous dis­cus­sion… no type-er­rors re­quired, and each “value” comes nat­u­rally from the given con­text.

That “ob­ject” is what I have in my head when I’m talk­ing about some­thing that can be both the num­ber and the se­quence, and in which it can re­veal the prop­er­ties of it­self de­pend­ing on how you ask it.

...it’s also a rea­son­ably good ex­am­ple of duck-typ­ing ;)
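That object can be sketched directly as a duck-typed class (a toy of my own devising, matching the question-and-answer behaviour described above):

```python
from fractions import Fraction

class OnePointNineRepeating:
    """An object that answers as the series, the limit, or a finite truncation."""

    def terms(self):
        # "how do I calculate your value?" -- yields 1, 0.9, 0.09, ...
        yield Fraction(1)
        term = Fraction(9, 10)
        while True:
            yield term
            term /= 10

    def limit(self):
        # "what do you equate to at the end of all that addition?"
        return 2

    def __eq__(self, other):
        # "do you equal two?" -- read as a question about the limit
        return self.limit() == other

    def partial(self, n):
        # stopping after n terms: almost 2, but not quite
        gen = self.terms()
        return sum(next(gen) for _ in range(n))

x = OnePointNineRepeating()
print(x == 2)        # True: the limit reading
print(x.partial(4))  # 1999/1000, i.e. 1.999 < 2: the finite reading
```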

• We want it to denote a number for simple consistency. .11111… is a number. It is a limit. 3.14159… should denote a number. Why should 1.99999… be any different? If we are going to be at all consistent in our notation they should all represent the same sort of series. Otherwise this is extremely irregular notation to no end.

• Yes, I to­tally agree with you: con­sis­tency and con­ve­nience are why we have cho­sen to use 1.9999… no­ta­tion to rep­re­sent the limit, rather than the se­quence.

Consistency and convenience tend to drive most mathematical notational choices (with occasional other influences), for reasons that should be extremely obvious.

It just so happened that, on this occasion, I was not aware enough of either the actual convention, or of other “things that this notation would be consistent with” before I guessed at the meaning of this particular item of notation.

And so my guessed mean­ing was one of the two things that I thought would be “likely mean­ings” for the no­ta­tion.

In this case, my guess was for the wrong one of the two.

I seem to be get­ting a lot of com­ments that are im­ply­ing that I should have some­how nat­u­rally re­al­ised which of the two mean­ings was “cor­rect”… and have tried very hard to ex­plain why it is not ob­vi­ous, and not some­how in­evitable.

Both of my pos­si­ble in­ter­pre­ta­tions were po­ten­tially valid, and I’d like to in­sist that the se­quence-one is wrong only by con­ven­tion (ie maths has to pick one or the other mean­ing… it hap­pens to be the most con­ve­nient for math­e­mat­i­ci­ans, which hap­pens in this case to be the limit-in­ter­pre­ta­tion)… but as is clearly ev­i­denced by the fact that there is so much con­fu­sion around the sub­ject (ref the wikipe­dia page) - it is not ob­vi­ous in­tu­itively that one is “cor­rect” and one is “not cor­rect”.

I main­tain that with­out knowl­edge of the con­ven­tion, you can­not know which is the “cor­rect” in­ter­pre­ta­tion. Any as­sump­tion oth­er­wise is sim­ply hind­sight bias.

• it is not ob­vi­ous in­tu­itively that one is “cor­rect” and one is “not cor­rect”.

There is no in­her­ent mean­ing to a set of sym­bols scrawled on pa­per. There is no “cor­rect” and “in­cor­rect” way of in­ter­pret­ing it; only con­ven­tion (un­less your goal is to com­mu­ni­cate with oth­ers). There is no Pla­tonic Ideal of Math­e­mat­i­cal No­ta­tion, so ob­vi­ously there is no ob­jec­tive way to pluck the “cor­rect” in­ter­pre­ta­tion of some sym­bols out of the in­ter­stel­lar void. You are right in as far as you say that.

How­ever, you are ex­pected to know the mean­ing of the no­ta­tion you use in ex­actly the same way that you are ex­pected to know the mean­ing of the words you use. Not know­ing is un­der­stand­able, but ob­serv­ing that it is pos­si­ble to not-know a con­ven­tion is not a par­tic­u­lar philo­soph­i­cal in­sight.

• For what it’s worth (and why do I have to pay karma to re­ply to this com­ment, I don’t get it) there is an in­finites­i­mal differ­ence be­tween the two. An in­finites­i­mal is just like in­finity in that it’s not a real num­ber. For all prac­ti­cal pur­poses it is equal to zero, but just like in­finity, it has use­ful math­e­mat­i­cal pur­poses in that it isn’t ex­actly equal to zero. You could plug an in­finites­i­mal into an equa­tion to show how close you can get to zero with­out ac­tu­ally get­ting there. If you just re­placed it with zero the equa­tion could come out un­defined or some­thing.

Like­wise us­ing 1.999… be­cause of the prop­erty that it isn’t ex­actly equal to 2 but is prac­ti­cally equal to 2, could be use­ful.

• er… I’m not sure if this is the right way to look at it.

1.999999… is 2. Exactly 2. The thing is, there is no infinitesimal difference between ‘2’ and ‘2’. 1.999999… isn’t “Two minus epsilon”, it’s “The limit of two minus epsilon as epsilon approaches zero”, which is two.
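Under the standard reading, this is a one-line limit computation (ordinary real analysis, stated here for reference):

```latex
1.999\ldots \;=\; \lim_{n\to\infty}\bigl(2 - 10^{-n}\bigr)
\;=\; 2 - \lim_{n\to\infty} 10^{-n}
\;=\; 2 - 0
\;=\; 2 .
```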

EDIT: And to ex­plain the fol­low­ing ob­jec­tion:

Weird things hap­pen when you ap­ply in­finity, but can it re­ally change a rule that is true for all finite num­bers?

Yes, ab­solutely. That’s part of the point of in­finity. One way of look­ing at cer­tain kinds of in­finity (note that there are sev­eral kinds of in­finity) is that in­finity is one of our place­hold­ers for where rules break down.

• This is one of those things that isn’t worth ar­gu­ing over at all, but I will any­ways be­cause I’m in­ter­ested. I’m prob­a­bly wrong be­cause peo­ple much smarter than me have thought about this be­fore, but this still doesn’t make any sense to me at all.

1.9 is just 2 minus 0.1, right? And 1.99 is just 2 minus 0.01. Each time you add an­other 9, you are di­vid­ing the num­ber you are sub­tract­ing by 10. No mat­ter how many times you di­vide 0.1 by ten, you will never ex­actly reach zero. And if it’s not ex­actly zero, then two minus the num­ber isn’t ex­actly two.

Even if you do it 3^^^3 times, it will still be more than zero. Weird things hap­pen when you ap­ply in­finity, but can it re­ally change a rule that is true for all finite num­bers? You can say it ap­proaches 2 but that’s not the same as it ever ac­tu­ally reach­ing it. Does this make any sense?

• In­ter­est­ing… three down-votes but only one soul kind enough to point out why what I said was wrong (thank you ci­pher­goth).

I find that quite dis­ap­point­ing—es­pe­cially as I’ve seen some de­liber­ate troll-bait­ing re­ceive fewer down-votes.

• Me: AGI is a William Tell tar­get. A near miss could be very un­for­tu­nate. We can’t re­spon­si­bly take a proper shot till we have an ap­pro­pri­ate level of un­der­stand­ing and con­fi­dence of ac­cu­racy.
Cale­do­nian: That’s not how William Tell man­aged it. He had to prac­tice aiming at less-dan­ger­ous tar­gets un­til he be­came an ex­pert, and only then did he at­tempt to shoot the ap­ple.

Yes, by “take a proper shot” I meant shoot­ing at the proper tar­get with proper shots. And yes, prac­tice on less-dan­ger­ous tar­gets is nec­es­sary, but it’s not suffi­cient.

It is not clear to me that it is de­sir­able to pre­judge what an ar­tifi­cial in­tel­li­gence should de­sire or con­clude, or even pos­si­ble to pur­pose­fully put real con­straints on it in the first place. We should sim­ply cre­ate the god, then ac­knowl­edge the truth: that we aren’t ca­pa­ble of eval­u­at­ing the think­ing of gods.

I agree we can’t ac­cu­rately eval­u­ate su­per­in­tel­li­gent thoughts, but that doesn’t mean we can’t or shouldn’t try to af­fect what it thinks or what it’s goals are.

I couldn’t do this ar­gu­ment jus­tice. I en­courage in­ter­ested read­ers to read Eliezer’s pa­per on co­her­ent ex­trap­o­lated vo­li­tion.

• Nomin­ull, I kind of agree that they are the same at the limit of in­finite digits (as­sum­ing by 2 you mean 2.000...). It just seems to me that work­ing with num­bers that are sub­ject to this kind of limit is the wrong ap­proach to math­e­mat­ics if we want maths to be tied to some­thing real in this uni­verse, es­pe­cially when the limit is im­plicit and hid­den in the no­ta­tion.

• No, by 2 I mean 1.999...

A_A

• Benoit,

1,999… can only be the same (or equal) to 2 in some kind of imaginary world. The number 1,999… where there is an infinity of 9’s does not “exist” in so far as it cannot be “represented” in a finite amount of space or time. The only way out is to “represent” infinity by (...). So you represent something infinite by something finite, thus avoiding a serious problem. But then stating that 1,999… is equal to 2 becomes a tautology.

Of course math­e­mat­i­ci­ans now are used to deal with in­fini­ties. They can ma­nipu­late them any which way they want. But in the end, in­finity has no equiv­a­lent in the “real” world. It is a use­ful ab­strac­tion.

So back to ar­ith­metic. We can only “count” be­cause our phys­i­cal world is a quan­tum world. We have units be­cause the ba­sic el­e­ments are units, like el­e­men­tary par­ti­cles. If the real world were a con­tinuum, there would be no ar­ith­metic. Fur­ther­more, ar­ith­metic is a fea­ture of the macro­scopic world. When you look closer, it breaks down. In quan­tum physics, 1+1 is not always equal to two. You can have many par­ti­cles in the same quan­tum state that are in­dis­t­in­guish­able. How do you count sheep when you can’t dis­t­in­guish them?

I don’t see anything “obvious” in stating that 1+1=2. It’s only a convention. “1” is a symbol. “2” is another symbol. Trace it back to the “real” world, and you find that to have one object plus another of the same object (but distinct) requires subtle physical conditions.

On an­other note, ar­ith­metic is a re­cent in­ven­tion for hu­man­ity. Early peo­ple couldn’t count to more than about 5, if not 3. Our brain is not that good at count­ing. That’s why we learn ar­ith­metic ta­bles by heart, and count with our fingers. We have not “evolved” as ar­ith­meti­ci­ans.

• On an­other note, ar­ith­metic is a re­cent in­ven­tion for hu­man­ity. Early peo­ple couldn’t count to more than about 5, if not 3.

If we were on wikipe­dia, I could add [Ci­ta­tion needed] to this state­ment :)

Also—can you spec­ify what you mean by “re­cent”: 10,000 years? 4,000 years? 800 years? Last week ?

• “Trace it back to the “real” world, and you find that to have one ob­ject plus an­other of the same ob­ject (but dis­tinct) re­quires sub­tle phys­i­cal con­di­tions.”

Are there ob­jects and this no­tion of “same but dis­tinct” in the “real” world? I think if you stop at ob­jects, you haven’t traced back far enough. (By the way has there been much/​any dis­cus­sion of ob­jects on LW that I’ve missed?)

• I agree that in­finity is an ab­strac­tion. What I’m try­ing to say is that this con­cept is of­ten abused when it is taken as im­plicit in real num­bers.

“We can only “count” be­cause our phys­i­cal world is a quan­tum world. We have units be­cause the ba­sic el­e­ments are units, like el­e­men­tary par­ti­cles. If the real world were a con­tinuum, there would be no ar­ith­metic.”

I don’t see it that way. In Eu­clid’s book, vari­ables are as­signed to seg­ment lengths and other ge­ome­tries that tie alge­bra to ge­o­met­ric in­ter­pre­ta­tions. IMO, when math­e­mat­ics stray away from some­thing that can be in­ter­preted phys­i­cally it leads to con­fu­sion and er­rors.

What I’d like to see is a defi­ni­tion of real num­bers that is closer to re­al­ity and that al­lows us to en­code our knowl­edge of re­al­ity more effi­ciently. A defi­ni­tion that does not al­low ab­stract limits and in­finite pre­ci­sion. Us­ing the “sig­nifi­cant digits” in­ter­pre­ta­tion seems to be a step in the right di­rec­tion to me as all of our mea­sure­ment and knowl­edge is sub­ject to some kind of er­ror bar.

We could for example, define a set of real numbers such that we always use as many digits as needed so that the quantization error from the limited number of digits is at least a hundred times smaller than the error in the value we are measuring. This way, the error caused by the use of this real number system would always explain less than 1% of the variance of our measurements based on it.

This also seem to re­quire that we dis­t­in­guish math­e­mat­ics on nat­u­ral num­bers which rep­re­sent countable whole items, and math­e­mat­ics that rep­re­sent con­tin­u­ous scales which would be best rep­re­sented by the real num­bers sys­tem with the limited sig­nifi­cant digits.

Now this is just an idea, I’m just an am­a­teur math­e­mat­i­cian but I think it could re­solve a lot of is­sues and para­doxes in math­e­mat­ics.
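As a rough sketch of the proposal above (my own toy quantization, not a worked-out number system), rounding to a fixed count of significant digits looks like:

```python
import math

def round_sig(x, sig):
    """Round x to `sig` significant digits (a crude quantization)."""
    if x == 0:
        return 0.0
    # Position of the leading digit determines where the last kept digit falls.
    exponent = math.floor(math.log10(abs(x)))
    factor = 10 ** (sig - 1 - exponent)
    return round(x * factor) / factor

# Quantization error is bounded by half a unit in the last significant place.
print(round_sig(3.14159, 3))    # 3.14
print(round_sig(0.0019994, 3))  # 0.002
```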

• 1.9999… = 2 is not an “is­sue” or a “para­dox” in math­e­mat­ics.

If you use a limited num­ber of digits in your calcu­la­tions, then your quan­ti­za­tion er­rors can ac­cu­mu­late. (And sup­pose the quan­tity you are mea­sur­ing is the differ­ence of two much larger num­bers.)

Of course it’s pos­si­ble that there’s noth­ing in the real world that cor­re­sponds ex­actly to our so-called “real num­bers”. But un­til we ac­tu­ally know what smaller-scale struc­ture it is that we’re ap­prox­i­mat­ing, it would be crazy to pick some ar­bi­trary “lower-re­s­olu­tion” sys­tem and hope it matches the world bet­ter. That’s do­ing for “finite­ness” what Eliezer has some­where or other com­plained about peo­ple do­ing for “com­plex­ity”.

• ″...math­e­mat­ics that rep­re­sent con­tin­u­ous scales which would be best rep­re­sented by the real num­bers sys­tem with the limited sig­nifi­cant digits.”

If you limit the num­ber of sig­nifi­cant digits, your math­e­mat­ics are dis­crete, not con­tin­u­ous. I’m guess­ing the con­cept you’re re­ally af­ter is the idea of com­putable num­bers. The set of com­putable num­bers is a dense countable sub­set of the re­als.

• “Pocket calcu­la­tors work by stor­ing a gi­ant lookup table of ar­ith­meti­cal facts”.

you can’t cre­ate a lookup table with­out proper math.

• With the graph­i­cal-net­work in­sight in hand, you can give a math­e­mat­i­cal ex­pla­na­tion of ex­actly why first-or­der logic has the wrong prop­er­ties for the job, and ex­press the cor­rect solu­tion in a com­pact way that cap­tures all the com­mon-sense de­tails in one el­e­gant swoop.

Con­sider the fol­low­ing ex­am­ple, from Men­z­ies’s “Causal Models, To­ken Cau­sa­tion, and Pro­cesses”[*]:

An as­sas­sin puts poi­son in the king’s coffee. The body­guard re­sponds by pour­ing an an­ti­dote in the king’s coffee. If the body­guard had not put the an­ti­dote in the coffee, the king would have died. On the other hand, the an­ti­dote is fatal when taken by it­self and if the poi­son had not been poured in first, it would have kil­led the king. The poi­son and the an­ti­dote are both lethal when taken singly but neu­tral­ize each other when taken to­gether. In fact, the king drinks the coffee and sur­vives.

We can model this situ­a­tion with the fol­low­ing struc­tural equa­tion sys­tem:

A = true
G = A
S = (A and G) or (not-A and not-G)

where A is a boolean vari­able de­not­ing whether the As­sas­sin put poi­son in the coffee or not, G is a boolean vari­able de­not­ing whether the Guard put the an­ti­dote in the coffee or not, and S is a boolean vari­able de­not­ing whether the king Sur­vives or not.

Ac­cord­ing to Pearl and Halpern’s defi­ni­tion of ac­tual cau­sa­tion, the as­sas­sin putting poi­son in the coffee causes the king to sur­vive, since chang­ing the as­sas­sin’s ac­tion changes the king’s sur­vival when we hold the guard’s ac­tion fixed. This is clearly an in­cor­rect ac­count of cau­sa­tion.
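The counterfactual computation described in that verdict can be sketched as follows (a sketch of the quoted model only, with the intervention done by holding G at its actual value):

```python
def survives(a, g):
    """S = (A and G) or (not A and not G): lethal singly, neutral together."""
    return (a and g) or (not a and not g)

# Actual world: assassin poisons (A=True), guard responds (G=A=True).
a_actual, g_actual = True, True
assert survives(a_actual, g_actual)  # the king survives

# Halpern-Pearl style check: hold G fixed at its actual value and
# intervene on A. Flipping A flips S, so A counts as an actual cause
# of the king's survival -- the counterintuitive verdict in question.
assert survives(False, g_actual) is False
```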

IMO, graph­i­cal mod­els and re­lated tech­niques rep­re­sent the biggest ad­vance in think­ing about causal­ity since Lewis’s work on coun­ter­fac­tu­als (though James Heck­man dis­agrees, which should make us a bit more cir­cum­spect). But they aren’t the end of the line, even if we re­strict our at­ten­tion to ma­nipu­la­tion­ist ac­counts of causal­ity.

[*] The pa­per is found here. As an aside, I do not agree with Men­z­ies’s pro­posed re­s­olu­tion.

• Um, this sounds not cor­rect. The as­sas­sin causes the body­guard to add the an­ti­dote; if the body­guard hadn’t seen the as­sas­sin do it, he wouldn’t have so added. So if you com­pute the coun­ter­fac­tual the Pear­lian way, ma­nipu­lat­ing the as­sas­sin changes the body­guard’s ac­tion as well, since the body­guard causally de­scends from the as­sas­sin.

• Right—and ac­cord­ing to Pearl’s causal beam method, you would first note that the guard sus­tains the coffee’s (non)dead­li­ness-state against the as­sas­sin’s ac­tion, which ul­ti­mately makes you deem the guard the cause of the king’s sur­vival.

• Fur­ther­more, if you draw the graph the way Neel seems to sug­gest, then the body­guard is adding the an­ti­dote with­out de­pen­dence on the ac­tions of the as­sas­sin, and so there is no longer any rea­son to call one “as­sas­sin” and the other “body­guard”, or one “poi­son” and the other “an­ti­dote”. The body­guard in that model is try­ing to kill the king as much as the as­sas­sin is, and the as­sas­sin’s timely in­ter­ven­tion saved the king as much as the body­guard’s.

• “But un­til we ac­tu­ally know what smaller-scale struc­ture”.

From http://en.wikipedia.org/wiki/Planck_Length: “Combined, these two theories imply that it is impossible to measure position to a precision greater than the Planck length, or duration to a precision greater than the time a photon traveling at c would take to travel a Planck length”

There­fore, one could in fact say that all time- and dis­tance- de­rived mea­sure­ments can in fact be trun­cated to a fixed num­ber of dec­i­mal places with­out los­ing any real pre­ci­sion, by us­ing pre­ci­sions based on the Planck Length. There’s no point in hav­ing pre­ci­sion smaller than the limits in the quote above, as any­thing smaller is un­ob­serv­able in our cur­rent un­der­stand­ing of physics.

That length is approximately 1.6 x 10^-35 meters, and the corresponding time duration is approximately 5.33702552 x 10^-44 seconds.

• “When the ba­sic prob­lem is your ig­no­rance, clever strate­gies for by­pass­ing your ig­no­rance lead to shoot­ing your­self in the foot.”

I like this lesson. It rings true to me, but the problem of ego is not one to be overlooked. People like feeling smart and having the status of being a “learned” individual. It takes a lot of courage to profess ignorance in today’s academic climate. We are taught that we have such sophisticated techniques to solve really hard problems. There are armies of scientists and engineers working to advance our society every minute. But who stops and asks “if these guys (and gals) are so smart, why is it that such fundamental ignorance still exists in so many fields”?

Yes, there are our current theories, but how many of them are truly impressive? How many logically follow from the context vs. how many took a truly creative breakthrough? The myth of reductionism promises steady progress, but it is the individual who gets inspired. It boils down to humility. Man is too arrogant to admit that he is still clueless on many fundamental problems. How could that possibly be true if we are all so smart in our modern age? Who amongst you will admit when something that seems very sophisticated actually makes no sense? You’ll probably just feel stupid for not understanding, but the problem is not necessarily with you.

Dogma creeps into any organization of people, and science is no different. We assume our level of understanding in certain subjects applies equally to all. Until people have the courage to question very fundamental assumptions on how we approach new problems, we will not progress, or worse, we will find much work has been done on a faulty foundation. Figuring out the right question to ask is the most important hurdle of all. But who has time when we are judged not by the quality of our thought but by the quantity? Some very important minds only produced a handful of papers, but they were worth reading...

• anony­mous—I’d like to sec­ond that motion

• I read a book on the philosophy of set theory—and I got lost right at the point where classical infinite thought was replaced by modern infinite thought. IIRC the problem was paradoxes based on infinite recursion (Zeno et al.) and finding mathematical foundations to satisfy calculus limits. Then something about Cantor, cardinality, and some hand-wavy "infinite sets are real!".

1.999… is just an infinite summation of finite numbers: 1 + 0.9 + 0.09 + …

Now, how an in­finite pro­cess on an in­finite set can equal an in­te­ger is a prob­lem I still grap­ple with. Clas­si­cal the­ory said that this was non­sense since one would never finish the sum­ma­tion (if one were to be­gin). I tend to agree and I sup­pose one could say I see in­finity as a verb and not a noun.

I sug­gest any­one who be­lieves 1.999… === 2 re­ally looks into what that means. The root of the ar­gu­ment isn’t “What is the num­ber be­tween 1.999… and 2?” but rather “Can we say that 1.999… is a sen­si­ble the­o­ret­i­cal con­cept?”

• Clas­si­cal the­ory said that this was non­sense since one would never finish the sum­ma­tion (if one were to be­gin).

It was non­sense in clas­si­cal the­ory. In­finite sum has its own sep­a­rate defi­ni­tion.

I tend to agree and I sup­pose one could say I see in­finity as a verb and not a noun.

There are times in mod­ern math­e­mat­ics that in­finite num­bers are used. This is not one of them.

I doubt I'm the best at explaining what limits are, so I won't bother. I may be able to tell you what they aren't. They give results similar to the intuitive idea of infinite numbers, but they don't do it in the most intuitively obvious way. They don't use infinite numbers. They use a certain property that at most one number will have in relation to a sequence. In the case of 1, 1.9, 1.99, …, this number is 2. In the case of 1, 0, 1, 0, …, there is no such number, so the sequence is said not to converge.

… “Can we say that 1.999… is a sen­si­ble the­o­ret­i­cal con­cept?”

No. The ques­tion is “Can we make a sen­si­ble the­o­ret­i­cal way to in­ter­pret the nu­meral 1.999..., that ap­prox­i­mately matches our in­tu­itions?” It wasn’t easy, but we man­aged it.
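The property in question can be made concrete with exact arithmetic. A minimal sketch (the helper `partial_sum` is mine, not anything from the thread): the partial sums 1, 1.9, 1.99, … get within any tolerance of 2 and stay there, and 2 is the only number with that property.

```python
from fractions import Fraction

def partial_sum(n):
    """Exact value of 1.99...9 with n nines, i.e. 1 + 9/10 + ... + 9/10**n."""
    return 1 + sum(Fraction(9, 10 ** k) for k in range(1, n + 1))

# The distance from 2 shrinks as 10**-n, so every tolerance is eventually
# beaten; that is exactly what "the limit of the sequence is 2" asserts.
# No infinite number appears anywhere in the computation.
for n in range(6):
    print(n, partial_sum(n), float(2 - partial_sum(n)))
```

Nothing here "finishes an infinite summation"; the limit is a statement about all the finite partial sums at once.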

• 1.999… does not equal 2 - it just tends to­wards 2

For all prac­ti­cal pur­poses, you could sub­sti­tute one for the other.

But in the­ory, you know that 1.9999… is always just be­low 2, even though it creeps ever closer.

If we ever found a way to mag­ick­ally “reach in­finity” they would fi­nally meet… and be “equal”.

Edit: The numbers are always going to be slightly different in a finite space, but equate to the same thing when you allow infinities. I.e. mathematically, in the limit, they equate to the same value, but in any finite representation, they are different.

Further Edit: According to mathematical convention, the notation "1.999..." does refer to the limit. Therefore, "1.999..." strictly refers to 2 (not to any finite case that is slightly less than two).

• The is­sue with AI has noth­ing to do with ig­no­rance or ar­ro­gance. The ba­sic prob­lem is that in­tel­li­gence can’t be mean­ingfully defined or mean­ingfully quan­tified. Doc­u­mented fact: Richard Feyn­man had a mea­sured I.Q. of 120. Doc­u­mented fact: Mar­ilyn Vos Sa­vant had a mea­sured I.Q. of 180 or 200, de­pend­ing on which test you place more faith in. Doc­u­mented fact: Feyn­man made a huge break­through in physics, Vos Sa­vant has ac­com­plished noth­ing worth men­tion­ing in her life. I.Q. mea­sure­ments fail to mea­sure in­tel­li­gence in any mean­ingful way.

Here's another fact for you. Lewis Terman collected a group of so-called "geniuses" sieved by their high I.Q. scores. Two future Nobel prize winners, Shockley and Alvarez, were tested but rejected by Terman's I.Q. cutoff and weren't part of the group.

Ques­tion: What does this tell you about cur­rent meth­ods for mea­sur­ing in­tel­li­gence?

There is no ev­i­dence that peo­ple can mean­ingfully define or ob­jec­tively mea­sure in­tel­li­gence. Rule of thumb: if you can’t define it and you can’t mea­sure it ob­jec­tively, you can’t do sci­ence about it.

[Re­main­der of gi­gan­tic com­ment trun­cated by ed­i­tor.]

• I’m sur­prised no­body brought this up at the time, but it’s tel­ling that you’ve only picked out ex­am­ples of hu­mans when dis­cussing in­tel­li­gence, not bac­te­ria or rocks or the color blue. I sub­mit that the prop­erty is not as un­know­able as you would sug­gest.

• The prob­lem isn’t that it can’t be mean­ingfully defined or quan­tified. The prob­lem is that it hasn’t been. I have no idea how hard it is to do that. It may very well be be­yond any­thing any hu­man can do, but it’s the­o­ret­i­cally pos­si­ble.

In the hypothetical universe, addition certainly could be defined; it's just that nobody in that universe knew how.

• Intelligence is a multidimensional concept that is not amenable to any single definition or quantification. Take for instance the idea of "the size of a tree." Size could mean height, drip radius, mass, volume of the smallest convex polyhedron that contains the whole organism, volume of water displaced if the tree were immersed in a tank, trunk girth at 6 feet, etc. The tallest redwood is taller than the tallest sequoia, but isn't the sequoia bigger? Why is it bigger? Because it has greater mass? But what of the biggest banyan? It has a greater mass than both the redwood and the sequoia.

The problem with intelligence is not that it's not quantifiable, but that different researchers use different mapping functions, all the while pretending they're measuring the exact same thing, heaping up the confusion. If you pick one specific mental activity (arithmetic, visual memory, music-compositional ability, language processing), it is rarely very difficult to measure and rank people by their adeptness. If, on the other hand, you try to come up with a "good" way to map many different intelligences together onto some scale, you're going to be terrible at using this scale to predict individual performance at specific tasks. Further, individuals with low IQ (or another attempted measure of general intelligence) may be brilliant at specific tasks precisely because so much of their brain is dedicated to those tasks that they have little left over for anything else. This is especially true of many autistic individuals.

In the end, intelligence is rather easy to define if you recognize it as the multifaceted phenomenon that it is.

• No­bel prizes aren’t based only on in­tel­li­gence, but also on drive, per­sis­tence and also a lit­tle luck (mainly the luck to find some­thing in­ter­est­ing to work on that no­body else has yet solved).

After all “1% in­spira­tion and 99% per­spira­tion” yes?

• Drive and per­sis­tence are part of in­tel­li­gence, at least in the sense that any use­ful AI would have to have them. Say­ing it mea­sures luck is just say­ing that it’s im­pre­cise.

That said, it’s not go­ing to mea­sure all the differ­ent com­po­nents of in­tel­li­gence in the way we want.

• No­bel prizes are mea­sur­ing some­thing (or, more likely, a bunch of things), but is it a good match for what we mean by in­tel­li­gence?

• Ques­tion: What does this tell you about cur­rent meth­ods for mea­sur­ing in­tel­li­gence?

Bet­ter ques­tion: why do you in­sist that those ex­am­ples are of failures to ac­knowl­edge in­tel­li­gence when you also in­sist that we are un­able to mean­ingfully define in­tel­li­gence?

• mclaren, your com­ment is way too long. I have trun­cated it and emailed you the full ver­sion. Feel free to post the com­ment to your blog, then post a link to the blog here.

• Anony­mous (re Planck scales etc.), sure you can trun­cate your rep­re­sen­ta­tions of lengths at the Planck length, and like­wise for your rep­re­sen­ta­tions of times, but this doesn’t sim­plify your num­ber sys­tem un­less you have ac­cept­able ways of trun­cat­ing all the other num­bers you need to use. And, at pre­sent, we don’t. Sure, maybe re­ally the uni­verse is best con­sid­ered as some sort of dis­crete net­work with some funky struc­ture on it, but that doesn’t give us any way of sim­plify­ing (or mak­ing more ap­pro­pri­ate) our math­e­mat­ics un­til we know just what sort of dis­crete net­work with what funky struc­ture. (And I think ev­ery sketch-of-a-the­ory we cur­rently have along those lines still uses con­tin­u­ously vary­ing quan­tities as quan­tum “am­pli­tudes”, too.)

James (re mathematics and infinite sets and suchlike), it seems unfair to criticize something as being handwavy when you demonstrably don't remember it clearly; how do you know that the vagueness is in the thing itself rather than your recollection? There is a perfectly clear and simple definition of what a sum like 1 + 9/10 + 9/100 + … means (which, btw, is surely enough to call it "a sensible theoretical concept"), and what that particular one means is 2. If you have a different definition, or a different way of doing mathematics, that you like better, then feel free to adopt it and do mathematics that way; if you end up with a theory at least as coherent, useful and elegant as the usual one then perhaps it'll catch on.
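For reference, that definition can be sketched in one line: an infinite sum is shorthand for the limit of its finite partial sums, and the limit is defined by a condition that quantifies only over finite n.

```latex
s_n = 1 + \frac{9}{10} + \cdots + \frac{9}{10^n} = 2 - 10^{-n},
\qquad
\lim_{n \to \infty} s_n = L
\;\stackrel{\text{def}}{\Longleftrightarrow}\;
\forall \varepsilon > 0 \;\exists N \;\forall n \ge N :\; |s_n - L| < \varepsilon .
```

Since |s_n − 2| = 10^{−n} falls below any given ε once n is large enough, the sum is 2 by this definition; no completed infinite process is invoked.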

Anony­mous (re hu­mil­ity, re­duc­tion­ism, etc.): I think your com­ment con­sisted mostly of ap­plause lights. Science is demon­stra­bly pretty good at ques­tion­ing fun­da­men­tal as­sump­tions (con­sider, say, he­lio­cen­tric­ity, rel­a­tivity, quan­tum me­chan­ics, con­ti­nen­tal drift); what ev­i­dence have you that more effort should go into ques­tion­ing them than cur­rently does? (Clearly some should, and does. Clearly much effort spent that way is wasted, and pro­duces pseu­do­science or merely frus­tra­tion. The ques­tion is how to ap­por­tion the effort.)

• Thanks g for the tip about computable numbers, that's pretty much what I had in mind. I didn't quite get from the Wikipedia article whether these numbers could or could not replace the reals for all of useful mathematics, but it's interesting indeed.

• James, I share your feelings of uneasiness about infinite digits. As you said, the problem is not that these numbers will not represent the same points at the limit, but that they shouldn't be taken to the limit so readily, as this doesn't seem to add anything to mathematics but confusion.

• @James:

If I re­call my New­ton cor­rectly, the only way to take this “sum of an in­finite se­ries” busi­ness con­sis­tently is to in­ter­pret it as short­hand for the limit of an in­finite se­ries. (Cf. New­ton’s Prin­cipia Math­e­mat­ica, Lemma 2. The in­finites­i­mally wide par­allel­o­grams are du­bitably real, but the area un­der the curve be­tween the sets of par­allel­o­grams is clearly a real, definite area.)

@Benoit:

Why shouldn't we take 1.9999… as just another, needlessly complicated (if there's no justifying context) way of writing "2"? Just as I could conceivably count "1, 2, 3, 4, d(5x)/dx, 6, 7" if I were a crazy person.

• Ben­quo, I see two pos­si­ble rea­sons:

1) ‘2’ leads to con­fu­sion as to whether we are rep­re­sent­ing a real or a nat­u­ral num­ber. That is, whether we are count­ing dis­crete items or we are rep­re­sent­ing a value on a con­tinuum. If we are count­ing items then ‘2’ is cor­rect.

2) If it is clear that we are representing numbers on a continuum, I could see the number of significant digits used as an indication of the amount of uncertainty in the value. For any real problem there is always uncertainty caused by A) the measuring instrument and B) the representation system itself, such as the computable numbers, which are limited to a finite number of digits (although we get to choose the uncertainty here, as we choose the number of digits). This is one of the reasons the infinite limits don't seem useful to me. They don't correspond to reality. The implicit limits seem to lead to sloppiness in dealing with uncertainty in number representation.

For example I find ambiguity in writing 1/3 = 0.333… However, 1.000/3.000 = 0.333, or even 1.000.../3.000... = 0.333…, make more sense to me, as it is clear where there is uncertainty or where we are taking infinite limits.

• Benoit Es­si­am­bre,

Right now Wikipe­dia’s ar­ti­cle is claiming that calcu­lus can­not be done with com­putable num­bers, but a Google search turned up a pa­per from 1968 which claims that differ­en­ti­a­tion and in­te­gra­tion can be performed on func­tions in the field of com­putable num­bers. I’ll go and fix Wikipe­dia, I sup­pose.

• eh? maths is well defined and well struc­tured etc. in­tu­itive think­ing isn’t and so can’t be en­coded into a com­puter pro­gram very eas­ily, that was the whole point of min­sky’s pa­per! are you a bit thick or some­thing??

• Benoit Es­si­am­bre,

You say:

“1) ‘2’ leads to con­fu­sion as to whether we are rep­re­sent­ing a real or a nat­u­ral num­ber. That is, whether we are count­ing dis­crete items or we are rep­re­sent­ing a value on a con­tinuum.”

If I re­call cor­rectly, this “con­fu­sion” is what al­lowed mod­ern, atomic chem­istry. Chem­i­cal sub­stances—mea­sured as con­tin­u­ous quan­tities—seem to com­bine in sim­ple nat­u­ral-num­ber ra­tios. This was the pri­mary ev­i­dence for the ex­is­tence of atoms.

What is the prac­ti­cal nega­tive con­se­quence of the con­fu­sion you’re try­ing to avoid?

You also say:

"2) If it is clear that we are representing numbers on a continuum, I could see the number of significant digits used as an indication of the amount of uncertainty in the value. For any real problem there is always uncertainty caused by A) the measuring instrument and B) the representation system itself, such as the computable numbers, which are limited to a finite number of digits (although we get to choose the uncertainty here, as we choose the number of digits). This is one of the reasons the infinite limits don't seem useful to me. They don't correspond to reality. The implicit limits seem to lead to sloppiness in dealing with uncertainty in number representation."

But wouldn’t good sig-fig prac­tice round 1.999… up to some­thing like 2.00 any­way?

• Benoit, it was “Cyan” and not me who men­tioned com­putable num­bers.

• Benoit, you as­sert that our use of real num­bers leads to con­fu­sion and para­dox. Please point to that con­fu­sion and para­dox.

Also, how would your pro­posed num­ber sys­tem rep­re­sent pi and e? Or do you think we don’t need pi and e?

• Well, for example, the fact that two different reals represent the same point: 2.00…, 1.99…; and the fact that they are not computable in a finite amount of time. Pi and e are quite representable within a computable number system; otherwise we couldn't reliably use pi and e on computers!

• Benoit, those are two different ways of writing the same real, just like 0.333… and 1/3 (or 1.0/3.0, if you insist) are the same number. That's not a paradox. 2 is a computable number, and thus so are 2.000… and 1.999..., even though you can't write down those ways of expressing them in a finite amount of time. See the definition of a computable number if you're confused.

1.999… = 2.000… = 2. Pe­riod.

• Benoit,

In the dec­i­mal nu­meral sys­tem, ev­ery num­ber with a ter­mi­nat­ing dec­i­mal rep­re­sen­ta­tion also has a non-ter­mi­nat­ing one that ends with re­cur­ring nines. Hence, 1.999… = 2, 0.74999… = 0.75, 0.986232999… = 0.986233, etc. This isn’t a para­dox, and it has noth­ing to do with the pre­ci­sion with which we mea­sure ac­tual real things. This sort of re­cur­ring rep­re­sen­ta­tion hap­pens in any po­si­tional nu­meral sys­tem.
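A standard informal argument for these recurring-nines identities (the digit-shift step is what the underlying series definition licenses) runs:

```latex
x = 1.999\ldots
\;\Rightarrow\; 10x = 19.999\ldots
\;\Rightarrow\; 10x - x = 19.999\ldots - 1.999\ldots = 18
\;\Rightarrow\; x = 2 .
```

The same manipulation gives 0.74999… = 0.75 and its analogues in any base.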

You seem very con­fused as to the dis­tinc­tion be­tween what num­bers are and how we can rep­re­sent them. All I can say is, these mat­ters have been well thought out, and you’d profit by read­ing as much as you can on the sub­ject and by try­ing to avoid get­ting too caught up in your pre­con­cep­tions.

• I could al­most con­vince my­self that you know some­thing I don’t about the way calcu­la­tors work, but af­ter the 12-year-old com­ment by “best ex­perts” was never backed up by any­thing, I had to jump ship. Where are you pul­ling this stuff?

• I com­pletely don’t un­der­stand this ar­ti­cle, and I’ve been a (rather good) soft­ware de­vel­oper for 10 years. Calcu­la­tors can’t add 200 + 200? What? Huh? I don’t get it.

Their processors are also not using lookup tables. Long ago in the '70s there was a processor that did that, but it had too many limitations.

I have no idea what the hell you’re talk­ing about here.

Also why the fuck is my email ad­dress re­quired? Why do we­blogs do that...

• Also why the fuck is my email ad­dress re­quired? Why do we­blogs do that...

To re­duce the num­ber of fake ac­counts (and there­fore trol­ling and spam).

Anony­mous post­ing (ie with­out a ver­ified email ad­dress) is al­lowed—but can be mod­er­ated sep­a­rately—and more stringently than non-anony­mous ac­counts.

Also, if you only have one email ad­dress and set up an ac­count and then pro­ceed to flame/​troll-bait the blog, your ac­count (and there­fore email ad­dress) can be blocked… and you will no longer be wel­come to post ex­cept by go­ing through the anony­mous chan­nel (which, as I men­tioned be­fore, will be more stringently checked).

In a perfect world ie one that did not con­tain trolls or flamers… such as once ex­isted on ye an­cient olde usenet (hon­est, there was a time when it wasn’t full of trolls!) such tac­tics were not re­quired…

• To re­duce the num­ber of fake ac­counts (and there­fore trol­ling and spam).

Ac­counts were not even re­quired on Over­com­ingBias. From origi­nal post:

But for pur­poses of over­com­ing bias, let us draw two morals:

This is one of the threads that were im­ported to less­wrong. (Per­haps we do not need to re­spond to trol­lish ques­tions that were posted over 2 years ago. :P)

• Maybe so, but the rea­son would have been similar… and 2008 isn’t so olde-dayes ago that ac­counts were un­heard of on blogs. It’s only 2 years. My blog re­quired ac­counts-for-post­ing back then ;)

• Maybe so, but the rea­son would have been similar…

You didn’t need an ac­count. You didn’t need to ver­ify any­thing. ar­gle­bar­gle@floodle­bock.com would have worked.

• This old post led me to an in­ter­est­ing ques­tion: will AI find it­self in the po­si­tion of our fic­tional philoso­phers of ad­di­tion? The ba­sic four func­tions of ar­ith­metic are so fun­da­men­tal to the op­er­a­tion of the digi­tal com­puter that an in­tel­li­gence built on digi­tal cir­cuitry might well have no idea of how it adds num­bers to­gether (un­less told by a com­puter sci­en­tist, of course).

• Bog: You are cor­rect. That is, you do not un­der­stand this ar­ti­cle at all. Pay at­ten­tion to the first word, “Sup­pose...”

We are not talk­ing about how calcu­la­tors are de­signed in re­al­ity. We are dis­cussing how they are de­signed in a hy­po­thet­i­cal world where the mechanism of ar­ith­metic is not well-un­der­stood.

• “Like shoot­ing blind­folded at a dis­tant tar­get”

So long as you know where the target is within five feet, it doesn't matter how small it is, how far away it is, whether or not you're blindfolded, or whether or not you even know how to use a bow. You'll hit it on a natural twenty. http://www.d20srd.org/srd/combat/combatStatistics.htm#attackRoll

• Log­i­cal fal­lacy of gen­er­al­iza­tion from fic­tional ev­i­dence.

• Damn right. And the same goes for the oft-quoted “mil­lion-to-one chances crop up nine times out of ten”.

Sup­pose that hu­man be­ings had ab­solutely no idea how they performed ar­ith­metic. Imag­ine that hu­man be­ings had evolved, rather than hav­ing learned, the abil­ity to count sheep and add sheep. Peo­ple us­ing this built-in abil­ity have no idea how it worked, the way Aris­to­tle had no idea how his vi­sual cor­tex sup­ported his abil­ity to see things. Peano Arith­metic as we know it has not been in­vented.

It occurred to me that a real-life example of this kind of thing is grammar. I don't know what the grammatical rules are for which of the words "I" or "me" should be used when I refer to myself, but I can still use those words with perfect grammar in everyday life*. This may be a better example to use, since it's one that everyone can relate to.

*I do use a rule for working out whether I should say "Sarah and I" or "Sarah and me", but that rule is just "use whichever one you would use if you were just talking about yourself". Thinking about it now I can guess at the "I/me" rule, but there's plenty of other grammar I have no idea about.

• In this thread itself. He's commenting on the top paragraph of the original post. (It seems like thread necromancy at LW is actually very common. It may not be a good term, given the negative connotations of necromancy for many people. Maybe thread cryonic revival?)

• I’d ex­pect here we’d give necro­mancy pos­i­tive con­no­ta­tions. Most of the peo­ple here seem to be against death.

I thought it’s only thread necro­mancy if it moves it to the front page. This web­site doesn’t seem to work like that.

I hope it doesn’t work like that, be­cause I posted most of my com­ments on old threads.

• I’d ex­pect here we’d give necro­mancy pos­i­tive con­no­ta­tions. Most of the peo­ple here seem to be against death.

Just because we have a specific attitude about things doesn't mean we need to go and use terminology that has pre-existing connotations. I don't think, for example, that calling cryonics "technological necromancy" or "supercold lichdom" would be helpful in getting people to listen, although both would be awesome names. However, Eliezer seems to disagree, at least in regards to cryonics in certain narrow contexts. See his standard line when people ask about his cryonics medallion: that it is a mark of his membership in the "Cult of the Severed Head."

There's actually a general trend in modern fantasy literature to see necromancy as less intrinsically evil. The most prominent example would be Garth Nix's "Abhorsen" trilogy, and the next most prominent would be Gail Martin's "Chronicles of the Necromancer" series. Both have necromancers as the main protagonists. However, in this context, most of the cached thoughts about death still seem to be present. In both series, the good necromancers use their powers primarily to stop evil undead and help usher people into accepting death and the afterlife. Someone should at some point write a fantasy novel in which there's a good necromancer who brings people back as undead.

I thought it’s only thread necro­mancy if it moves it to the front page. This web­site doesn’t seem to work like that. I hope it doesn’t work like that, be­cause I posted most of my com­ments on old threads.

Posts only get put on the main page if Eliezer decides to do so (which he generally does for most high-ranking posts).

• I don’t think for ex­am­ple that call­ing cry­on­ics “tech­nolog­i­cal necro­mancy” or “su­per­cold lich­dom” would be helpful to get­ting peo­ple listen al­though both would be awe­some names.

I dunno—I reckon you might get in­creased in­ter­est from the SF/​F crowd. :)

• Some­one should at some point write a fan­tasy novel in which there’s a good necro­mancer who brings peo­ple back as un­dead.

Funny. I was work­ing on some­thing an awful lot like that back in 2000. I wasn’t ter­ribly good at writ­ing back then, un­for­tu­nately.

• I don’t think for ex­am­ple that call­ing cry­on­ics “tech­nolog­i­cal necro­mancy” or “su­per­cold lich­dom” would be helpful to get­ting peo­ple listen al­though both would be awe­some names.

...or would they...nahh.

• There should be one on whatever page you're viewing my comment in (unless you're doing something unusual, like reading this in an RSS reader).

• McDer­mott’s old ar­ti­cle, “Ar­tifi­cial In­tel­li­gence and Nat­u­ral Stu­pidity” is a good refer­ence for sug­ges­tively-named to­kens and al­gorithms.

• Some­one needs to teach them how to count: {}, {{}}, {{},{{}}}, {{},{{}},{{},{{}}}}...

• even less es­o­teric: |, ||, |||, ||||, |||||, ….

Then “X” + “Y” = “XY”. For ex­am­ple |||| + ||| = |||||||.

It turns out the difficulty in addition is the insight that ordinals are just an unfriendly representation. One needs a map between representations so that the addition problem becomes trivial.
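A sketch of the two representations named above, with the map between them (the function names are mine): von Neumann ordinals make counting explicit but addition awkward, while tally strings make addition mere concatenation.

```python
def ordinal(n):
    """Von Neumann ordinal of n: the set of all smaller ordinals.
    ordinal(0) = {}, ordinal(1) = {{}}, ordinal(2) = {{}, {{}}}, ..."""
    s = frozenset()
    for _ in range(n):
        s = s | frozenset([s])  # successor step: s U {s}
    return s

def size(s):
    """Map back to a friendly representation: each successor step added
    exactly one new element, so the cardinality recovers n."""
    return len(s)

def add_tally(a, b):
    """In tally notation, addition is just concatenation: |||| + ||| = |||||||."""
    return a + b

def add_ordinals(x, y):
    """Add ordinals via the round trip: ordinal -> tally -> concatenate -> count."""
    return ordinal(len(add_tally("|" * size(x), "|" * size(y))))
```

Here `add_ordinals(ordinal(4), ordinal(3))` lands on `ordinal(7)`; the hard part was never the combining step, only the choice of representation.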

• when you need to pub­lish at least one pa­per per month

Gah! Any field with a pub­lish­ing re­quire­ment like that… I shud­der.

And… is it me, or is this one of the stupi­dest dis­cus­sion threads on this site?

• “I don’t think we have to wait to scan a whole brain. Neu­ral net­works are just like the hu­man brain, and you can train them to do things with­out know­ing how they do them. We’ll cre­ate pro­grams that will do ar­ith­metic with­out we, our cre­ators, ever un­der­stand­ing how they do ar­ith­metic.”

This sort of anti-pre­dicts the deep learn­ing boom, but only sort of.

Fully con­nected net­works didn’t scale effec­tively; re­searchers had to find (mostly prin­ci­pled, but some ad-hoc) net­work struc­tures that were ca­pa­ble of more effi­ciently learn­ing com­plex pat­terns.

Also, we’ve gen­uinely learned more about vi­sion by re­al­iz­ing the effec­tive­ness of con­volu­tional neu­ral nets.

And yet, the state of the art is to take a gen­er­al­iz­able ar­chi­tec­ture and to scale it mas­sively, not need­ing to know any­thing new about the do­main, nor learn­ing much new about it. So I do think Eliezer loses some Bayes points for his anal­ogy here, as it ap­plies to games and to lan­guage.
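As a toy illustration of the quoted claim (this is my sketch, not anything from the post or the comment): a two-weight linear model trained by gradient descent picks up addition from examples alone, with no step of the training loop encoding what addition is.

```python
import random

random.seed(0)
w1, w2, lr = 0.0, 0.0, 0.05  # weights start out knowing nothing about "+"

for _ in range(2000):
    a, b = random.uniform(-1, 1), random.uniform(-1, 1)
    err = (w1 * a + w2 * b) - (a + b)  # prediction minus the true sum
    w1 -= lr * err * a                 # gradient step on squared error
    w2 -= lr * err * b

print(round(w1, 3), round(w2, 3))  # both weights end up near 1.0
```

The learned rule is transparent here only because the model is two numbers; scale the same recipe up and the "how" becomes as opaque as the comment above describes.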