# Beautiful Math

Consider the sequence {1, 4, 9, 16, 25, …}. You recognize these as the square numbers, the sequence $A_k = k^2$. Suppose you did not recognize this sequence at a first glance. Is there any way you could predict the next item in the sequence? Yes: You could take the first differences, and end up with:

{4 − 1, 9 − 4, 16 − 9, 25 − 16, …} = {3, 5, 7, 9, …}

And if you don’t rec­og­nize these as suc­ces­sive odd num­bers, you are still not defeated; if you pro­duce a table of the sec­ond differ­ences, you will find:

{5 − 3, 7 − 5, 9 − 7, …} = {2, 2, 2, …}

If you can­not rec­og­nize this as the num­ber 2 re­peat­ing, then you’re hope­less.

But if you pre­dict that the next sec­ond differ­ence is also 2, then you can see the next first differ­ence must be 11, and the next item in the origi­nal se­quence must be 36 - which, you soon find out, is cor­rect.
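
The prediction procedure above can be sketched in a few lines of Python. This is only an illustration: the helper names `differences` and `predict_next` are made up for this sketch, and it assumes the deepest row of differences it reaches will keep repeating its last value.

```python
def differences(seq):
    """Return the successive differences of seq (each term minus the one before it)."""
    return [b - a for a, b in zip(seq, seq[1:])]


def predict_next(seq):
    """Predict the next term, assuming the deepest difference row keeps repeating."""
    rows = [list(seq)]
    # Keep differencing until a row is constant (or only one entry is left).
    while len(rows[-1]) > 1 and len(set(rows[-1])) > 1:
        rows.append(differences(rows[-1]))
    # Rebuilding upward from the bottom row amounts to summing the last entries.
    return sum(row[-1] for row in rows)


squares = [1, 4, 9, 16, 25]
print(differences(squares))               # [3, 5, 7, 9]
print(differences(differences(squares)))  # [2, 2, 2]
print(predict_next(squares))              # 36
```

Note that `predict_next` never squares anything; it only subtracts and adds.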

Dig down far enough, and you dis­cover hid­den or­der, un­der­ly­ing struc­ture, sta­ble re­la­tions be­neath chang­ing sur­faces.

The original sequence was generated by squaring successive numbers, yet we predicted it using what seems like a wholly different method, one that we could in principle use without ever realizing we were generating the squares. Can you prove the two methods are always equivalent? For thus far we have not proven this, but only ventured an induction. Can you simplify the proof so that you can see it at a glance, as Polya was fond of asking?

This is a very sim­ple ex­am­ple by mod­ern stan­dards, but it is a very sim­ple ex­am­ple of the sort of thing that math­e­mat­i­ci­ans spend their whole lives look­ing for.

The joy of math­e­mat­ics is in­vent­ing math­e­mat­i­cal ob­jects, and then notic­ing that the math­e­mat­i­cal ob­jects that you just cre­ated have all sorts of won­der­ful prop­er­ties that you never in­ten­tion­ally built into them. It is like build­ing a toaster and then re­al­iz­ing that your in­ven­tion also, for some un­ex­plained rea­son, acts as a rocket jet­pack and MP3 player.

Num­bers, ac­cord­ing to our best guess at his­tory, have been in­vented and rein­vented over the course of time. (Ap­par­ently some ar­ti­facts from 30,000 BC have marks cut that look sus­pi­ciously like tally marks.) But I doubt that a sin­gle one of the hu­man be­ings who in­vented count­ing vi­su­al­ized the em­ploy­ment they would provide to gen­er­a­tions of math­e­mat­i­ci­ans. Or the ex­cite­ment that would some­day sur­round Fer­mat’s Last The­o­rem, or the fac­tor­ing prob­lem in RSA cryp­tog­ra­phy… and yet these are as im­plicit in the defi­ni­tion of the nat­u­ral num­bers, as are the first and sec­ond differ­ence ta­bles im­plicit in the se­quence of squares.

This is what cre­ates the im­pres­sion of a math­e­mat­i­cal uni­verse that is “out there” in Pla­to­nia, a uni­verse which hu­mans are ex­plor­ing rather than cre­at­ing. Our defi­ni­tions tele­port us to var­i­ous lo­ca­tions in Pla­to­nia, but we don’t cre­ate the sur­round­ing en­vi­ron­ment. It seems this way, at least, be­cause we don’t re­mem­ber cre­at­ing all the won­der­ful things we find. The in­ven­tors of the nat­u­ral num­bers tele­ported to Count­ingland, but did not cre­ate it, and later math­e­mat­i­ci­ans spent cen­turies ex­plor­ing Count­ingland and dis­cov­er­ing all sorts of things no one in 30,000 BC could be­gin to imag­ine.

To say that human beings “invented numbers” (or invented the structure implicit in numbers) seems like claiming that Neil Armstrong hand-crafted the Moon. The universe existed before there were any sentient beings to observe it, which implies that physics preceded physicists. This is a puzzle, I know; but if you claim the physicists came first, it is even more confusing because instantiating a physicist takes quite a lot of physics. Physics involves math, so math, or at least that portion of math which is contained in physics, must have preceded mathematicians. Otherwise, there would have been no structured universe running long enough for innumerate organisms to evolve for the billions of years required to produce mathematicians.

The amaz­ing thing is that math is a game with­out a de­signer, and yet it is em­i­nently playable.

Oh, and to prove that the pat­tern in the differ­ence ta­bles always holds:

$(k + 1)^2 = k^2 + (2k + 1)$
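
Spelled out as a short worked derivation, writing Δ for "next term minus this term" (the Δ notation is an assumption of this sketch, not something the post itself uses):

$$
\begin{aligned}
\Delta A_k &= A_{k+1} - A_k = (k+1)^2 - k^2 = 2k + 1,\\
\Delta^2 A_k &= \Delta A_{k+1} - \Delta A_k = \bigl(2(k+1) + 1\bigr) - (2k + 1) = 2.
\end{aligned}
$$

So the first differences are the successive odd numbers and the second differences equal 2 for every k, not just for the first few terms of the table.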

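As for seeing it at a glance: picture a k × k square of dots; growing it into a (k + 1) × (k + 1) square adds a row of k dots, a column of k dots, and one corner dot, 2k + 1 in all.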

Think the square prob­lem is too triv­ial to be worth your at­ten­tion? Think there’s noth­ing amaz­ing about the ta­bles of first and sec­ond differ­ences? Think it’s so ob­vi­ously im­plicit in the squares as to not count as a sep­a­rate dis­cov­ery? Then con­sider the cubes:

1, 8, 27, 64...

Now, with­out calcu­lat­ing it di­rectly, and with­out do­ing any alge­bra, can you see at a glance what the cubes’ third differ­ences must be?

And of course, when you know what the cubes’ third differ­ence is, you will re­al­ize that it could not pos­si­bly have been any­thing else...

• Back in high school I dis­cov­ered this by ac­ci­dent (yes, I was re­ally bored!). I sup­pose it’s noth­ing new, but it turns out that this works for more than sim­ple squares and cubes:

Given any se­quence of num­bers, keep find­ing differ­ences of differ­ences un­til you hit a con­stant; the num­ber of iter­a­tions needed is the max­i­mum ex­po­nent in the for­mula that pro­duced the num­bers. That is, this works even if there are other terms, re­gard­less of whether any or all terms have co­effi­cients other than 1.
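
Here is a rough Python sketch of that recipe. The function name `difference_depth` and the example formula are invented for illustration, and the sketch assumes the terms come from evaluating a formula at evenly spaced integers. It returns `None` when the sample runs out before any row goes constant, which is what happens for non-polynomial sequences.

```python
def difference_depth(seq):
    """Count how many rounds of differencing are needed before a row is constant.

    Returns None if the rows run out (fewer than two entries) before that happens.
    """
    row = list(seq)
    depth = 0
    while len(row) >= 2:
        if len(set(row)) == 1:
            return depth
        row = [b - a for a, b in zip(row, row[1:])]
        depth += 1
    return None


# A cubic with extra terms and a leading coefficient other than 1.
cubic = [5 * k**3 - 2 * k + 7 for k in range(8)]
print(difference_depth(cubic))                       # 3

# Powers of two never settle down: their differences are powers of two again.
print(difference_depth([2**k for k in range(8)]))    # None
```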

• This is ob­vi­ous af­ter you learn calcu­lus. The “nth differ­ence” cor­re­sponds to nth deriva­tive (a se­quence just looks at in­te­ger points of a real-val­ued func­tion), so clearly a polyno­mial of de­gree n has con­stant nth deriva­tive. It would be even more ac­cu­rate to say that an nth an­tideriva­tive of a con­stant is pre­cisely a de­gree n polyno­mial.

• Differences and derivatives are not the same, though there is the obvious analogy. If you want to take derivatives and antiderivatives, you want to write in the x^k basis or the x^k/k! basis. If you want to take differences and sums, you want to write in the falling factorial basis or the x choose k basis.
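
To make the falling factorial remark concrete, here is a small worked identity. The underline notation for the falling factorial and the symbol Δ for the one-step difference are notational assumptions, not taken from the comment.

$$
\begin{aligned}
x^{\underline{k}} &= x(x-1)\cdots(x-k+1),\\
\Delta x^{\underline{k}} &= (x+1)^{\underline{k}} - x^{\underline{k}}
  = x(x-1)\cdots(x-k+2)\,\bigl[(x+1) - (x-k+1)\bigr] = k\,x^{\underline{k-1}},\\
\Delta \binom{x}{k} &= \binom{x+1}{k} - \binom{x}{k} = \binom{x}{k-1}.
\end{aligned}
$$

These mirror $\frac{d}{dx}x^k = k\,x^{k-1}$ and $\frac{d}{dx}\frac{x^k}{k!} = \frac{x^{k-1}}{(k-1)!}$, which is why each basis suits its own operation.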

• If you get a non-constant, yes. For a linear function, f(a+1) - f(a) = f’(a). Inductively you can then show that the nth one-step difference of a degree n polynomial f at a point a is f^(n)(a). But this doesn’t work for any order of difference other than n. Thanks for pointing that out!

• Ah, yes, that’s a good point, because the leading coefficient will be the same whether you use the x^k basis or the falling factorial basis.

• Nei­ther finite differ­ences nor calcu­lus are new to me, but I didn’t pick up the cor­re­la­tion be­tween the two un­til now, and it re­ally is ob­vi­ous.

This is why I love math­e­mat­ics—there’s always a trick hid­den up the sleeve!

• No­tice that the re­sult doesn’t hold if the points aren’t evenly spaced, so the solu­tion must use this fact.

• Iter­ated finite differ­ences cor­re­spond to deriva­tives in some non-ob­vi­ous way I can’t re­mem­ber (and can’t be both­ered to find out).

• So did I! And in gen­eral the nth or­der finite differ­ences of nth pow­ers will be n fac­to­rial.

• Really for non-polyno­mi­als, and I think that was im­plied by the phras­ing.

• I agree that it’s im­plied by work­ing out the logic and find­ing that it doesn’t ap­ply el­se­where. I dis­agree that it is im­plied by the phras­ing.

“Given any sequence of numbers”

doesn’t seem to restrict it, and though I suppose

“the number of iterations needed is the maximum exponent in the formula that produced the numbers”

implies that there is a “maximum exponent in the formula” and with slightly more reasoning (a number of iterations isn’t going to be fractional) that it must be a formula with a whole number maximum exponent, I don’t see anything that precludes, for instance, x^2 + x^(1/2), which would also never go constant.

• Sorry, I was us­ing the weak “im­plies”, and prob­a­bly too much char­ity.

And I usu­ally only look at this sort of thing in the con­text of al­gorithm anal­y­sis, so I’m used to think­ing that x squared is pretty much equal to 5 x squared plus 2 log x plus square root of x plus 37.

• None of this was re­ally new con­tent to me, and yet I en­joyed read­ing it im­mensely. It must be what go­ing to church is some­times like for a the­ist, the re­as­surance hear­ing a ser­mon on beau­tiful things you already know. :-)

But what was the bias to be over­come? Math­e­mat­i­cal pla­ton­ism?

• Any sequence of numbers $A_k = f(k)$, where f(k) is a polynomial of degree n, will have its nth differences a constant. This is the method of “finite differences”; in fact, taking the differences of a sequence of numbers is roughly analogous to differentiation, and taking partial sums is analogous to integration.
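
A small Python sketch of that analogy, under the assumption that "partial sums" means a running total started from the sequence's first term (playing the role of the constant of integration). The helper name `differences` is invented for this example.

```python
from itertools import accumulate


def differences(seq):
    """Successive differences: the discrete stand-in for differentiation."""
    return [b - a for a, b in zip(seq, seq[1:])]


odds = [3, 5, 7, 9]

# Running totals starting from 1 rebuild the squares from their first differences.
rebuilt = list(accumulate(odds, initial=1))
print(rebuilt)               # [1, 4, 9, 16, 25]

# Differencing the running totals gives back the sequence we summed.
print(differences(rebuilt))  # [3, 5, 7, 9]
```

The `initial=` argument of `itertools.accumulate` needs Python 3.8 or later.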

It’s an in­ter­est­ing fact about the way math­e­mat­ics has his­tor­i­cally de­vel­oped that the analo­gous state­ment about polyno­mi­als viewed as func­tions of real num­bers seems much more ob­vi­ous to most peo­ple that have some math­e­mat­i­cal train­ing.

• I love math. It’s the only rea­son I some­times wish I’d stayed in school. When I get rich, I want to hire a math­e­mat­i­cian to live in my base­ment and tu­tor me. I bet they can be had for cheap.

Pure math is potentially a perfect idea. Applied math; not so much. When you see that line of 2’s, how do you know it continues forever? You don’t. You’re making an induction; a beautiful guess. It’s only because you peeked at the real answer, an answer you yourself created, that you can confidently say that you “predicted” the sequence with your method.

I’m much more in­ter­ested in se­quences pro­duced in a sim­ple de­ter­minis­tic way that are ex­tremely difficult to crack. The move from “it makes no sense” to “it’s ob­vi­ous” is a crit­i­cal dy­namic in hu­man thought. I’d like to see you write about that.

As Polya would say, solv­ing these prob­lems is a heuris­tic pro­cess. The rea­son you think you find or­der when you dig down far enough is that you sys­tem­at­i­cally ig­nore any situ­a­tion where you don’t find or­der. Your cat­e­gories have or­der built into them. You are drawn to or­der. There are prob­a­bly a host of bi­ases in­fluenc­ing that: availa­bil­ity, on­tol­ogy, in­stru­men­tal­ism, and hind­sight among them.

There’s lots of order to be found. There are also infinite amounts of disorder, unprovable order, and alternate plausible order. Occam’s razor helps sort it out; that’s also a heuristic.

• Thanks for the beauty, it feels good. Some thinking out loud. I can’t help but feel that the key is in the successive layers of maps and territories: maths is (or contains) the map of which physics is the territory, physics is the map of which ‘the real world’ is the territory, ‘the real world’ is the map our brains create from the sensory input concerning the territory which is the ‘play of energies’ out there, while that in itself is another map. Antony Garrett Lisi’s proposal, as an example, would be the most elegant meta-map yet. What these maps have in common is being created by the human brain, a wet lump of nervous tissue comprising ad-hoc purpose-specific modules. It has specific ways of making maps, so small wonder all these layers of maps are coherent. Now if the ‘mathematics’ layer of maps has unforeseen and self-consistent properties, it could be just a manifestation of the nature of our map-making modules: they are rules-driven. So, is the Universe a geometric figure corresponding to a Lie E8 group, or does that just happen to be the way the human brain is built to interpret things?

• So… Should it be ob­vi­ous that the nth differ­ence is n fac­to­rial? I’m afraid I don’t have in­tu­itive knowl­edge of all the prop­er­ties of the fac­to­rial func­tion.

• To go even further, if you take the sequence A[k] = k^n for each n and take differences until you reach a constant, then that constant is n factorial (n!).
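
This is easy to spot-check numerically. The following is a quick check rather than a proof, and the helper name `nth_difference` is made up for the sketch.

```python
from math import factorial


def nth_difference(seq, n):
    """Apply the difference operation n times and return the resulting row."""
    row = list(seq)
    for _ in range(n):
        row = [b - a for a, b in zip(row, row[1:])]
    return row


for n in range(1, 7):
    powers = [k**n for k in range(n + 3)]   # n + 3 terms leave 3 entries after n rounds
    row = nth_difference(powers, n)
    print(n, row, factorial(n))
    assert row == [factorial(n)] * 3
```

For n = 3 this prints the row `[6, 6, 6]`, which answers the question about the cubes' third differences.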

• This re­minds me of when I first started learn­ing about topolog­i­cal spaces, and then we added a met­ric, sud­denly all the the­o­rems and lem­mas we’d had to prove in fiddly ways with lots of ep­silons and deltas in first year anal­y­sis were blind­ingly and beau­tifully ob­vi­ous. The sheer glo­ri­ous in­ter­con­nect­ed­ness was so over­whelming that I very nearly or­gasmed in the lec­ture the­atre!

• Josh, if you think about a pic­ture like the one Eliezer drew (but in how­ever many di­men­sions you like) it’s kinda ob­vi­ous that the lead­ing term in the differ­ence be­tween two n-cubes con­sists of n (n-1)-cubes, one per di­men­sion. So the lead­ing term in the next differ­ence is n(n-1) (n-2)-cubes, and so on. But that doesn’t re­ally give the n! thing at a glance. I’m not con­vinced that any­thing to do with nth differ­ences can re­ally be seen at a glance with­out some more sym­bolic rea­son­ing in­ter­ven­ing.

James Bach, I sus­pect that the re­ally good math­e­mat­i­ci­ans can’t be had for cheap be­cause do­ing math­e­mat­ics is so im­por­tant to them, and the quite good math­e­mat­i­ci­ans can’t be had for cheap be­cause they’ve taken high-pay­ing jobs in fi­nance or soft­ware or other do­mains where a math­e­mat­i­cal mind is use­ful. But maybe it de­pends on what you count as “cheap” and what frac­tion of the math­e­mat­i­cian’s time you want to take up with tu­tor­ing...

Is­abel, I think per­haps differ­en­ti­a­tion re­ally is eas­ier in some sense than differenc­ing, not least be­cause the for­mu­lae are sim­pler. Maybe that stops be­ing true if you take as your ba­sic ob­jects not n^k but n(n-1)...(n-k+1) or some­thing, but it’s hard to see the feel­ing that n^k is sim­pler than that as mere his­tor­i­cal ac­ci­dent.

• Martin Gard­ner has a chap­ter on these “look-see” proofs in Knot­ted Donuts.

• The beauty of it all never fails to shock me, and I’m very much a novice. ‘The Road To Real­ity’ is sit­ting on my shelves look­ing darkly at me, wait­ing to be at­tacked for the first time. I can only imag­ine what be­ing at the cut­ting edge must be like.

“But maybe it depends on what you count as “cheap” and what fraction of the mathematician’s time you want to take up with tutoring...”

Such a math­e­mat­i­cian’s an­swer.… ;)

• Carl Lin­der­holm notes in Math­e­mat­ics Made Difficult that the next num­ber in 1 2 4 8 16 ? has to be 31, based on just those differ­ences.
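
The 31 can be reproduced by running the difference table all the way down to its single bottom entry and assuming every row simply repeats its last value. The sketch below is one reading of the joke, not Linderholm's own presentation, and the function name `extend_by_differences` is invented here.

```python
def extend_by_differences(seq):
    """Next term if each difference row, down to the one-entry bottom row, repeats its last value."""
    total = 0
    row = list(seq)
    while row:
        total += row[-1]                                # last entry of the current row
        row = [b - a for a, b in zip(row, row[1:])]     # move down one row
    return total


# Rows: [1, 2, 4, 8, 16], [1, 2, 4, 8], [1, 2, 4], [1, 2], [1]
print(extend_by_differences([1, 2, 4, 8, 16]))   # 16 + 8 + 4 + 2 + 1 = 31
```

Holding the bottom row constant is the same as extending the one polynomial of degree at most 4 that passes through the five given terms, so "31" is what the difference method predicts if nothing tells you the sequence is really the powers of two.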

Lautrea­mont on math­e­mat­i­cal ob­jects.

• Quote from a friend: “Math­e­mat­i­ci­ans are Pla­ton­ists pre­tend­ing to be for­mal­ists.”

Words that ring true in ev­ery math de­part­ment I’ve ever been to...

• “To say that hu­man be­ings “in­vented num­bers”—or in­vented the struc­ture im­plicit in num­bers—seems like claiming that Neil Arm­strong hand-crafted the Moon. The uni­verse ex­isted be­fore there were any sen­tient be­ings to ob­serve it, which im­plies that physics pre­ceded physi­cists.”

No, there’s a con­fla­tion of two things here.

Have you ever re­ally looked at a penny? I’m look­ing at a 1990 penny now. I know that if you look at the front and you see the bas-re­lief of Lin­coln, and the date 1990, and it’s a penny, then you can be sure that the back side will have a pic­ture of the Lin­coln memo­rial. It works! And you can find all sorts of con­nec­tions. Like, there’s a sin­gle “O” on the front, in the name GOD in the phrase IN GOD WE TRUST. And there’s a sin­gle “O” on the back, in the phrase ONE CENT. One O on the front, one O on the back. A con­nec­tion! You could make lots and lots of these in­ter­con­nec­tions be­tween the front and the back of the penny, and draw con­clu­sions about what it means. You could in­vent a dis­ci­pline of Pen­ny­ol­ogy if only some­body would fund it.

Is it true that Pen­ny­ol­ogy is im­plicit in pen­nies? In a way. Cer­tainly the pen­nies should ex­ist be­fore the Pen­ny­ol­ogy. But the pen­nies are only what­ever they are. The ex­is­tence of pen­nies doesn’t tell us much about what the prac­ti­tion­ers of the dis­ci­pline of Pen­ny­ol­ogy will ac­tu­ally no­tice. They might never pay at­ten­tion to the pair of O’s. There could be a fold in Lin­coln’s coat that af­ter the proper anal­y­sis pro­vides a solu­tion to the whole world crisis, and they may never pick up on it. While it’s pre­dictable that differ­ent in­de­pen­dent at­tempts at Pen­ny­ol­ogy would have a whole lot in com­mon since af­ter all they all need to be com­pat­i­ble with the same pen­nies, still they might be very differ­ent in some re­spects. You can’t nec­es­sar­ily pre­dict the Pen­ny­ol­ogy from look­ing at the penny. And you can’t pre­dict what math­e­mat­ics peo­ple will in­vent from ob­serv­ing re­al­ity.

You can pre­dict some things. A math­e­mat­ics that in­vents the same 2D plane we use and that proves a 3-color the­o­rem has some­thing wrong with it. But you can’t pre­dict which things will be found first or, to some ex­tent, which things will be found at all.

If there’s a re­al­ity that math­e­mat­ics must con­form to, still each in­di­vi­d­ual ver­sion of hu­man math­e­mat­ics is in­vented by hu­mans.

Similarly with physics. Our physics is invented. The reality the physics describes is real. We can imagine a platonic-ideal physics that fit the reality completely, but we don’t have an example of that to point at. So for example before Townes invented the laser, a number of great physicists claimed it was impossible. Townes got the idea because lasers could be described using Maxwell’s equations. But people thought that quantum mechanics provided no way to get that result. It turned out they were wrong.

Ac­tual physics is in­vented. Cer­tainly in­cor­rect physics must be in­vented. There’s noth­ing in re­al­ity that shows you how to do physics wrong.

• J Thomas—thought ex­per­i­ment: how differ­ent could alien math­e­mat­ics pos­si­bly be?

• I don’t see how Pen­ny­ol­ogy proves your point: a pen­ny­ol­o­gist dis­cov­ers that there’s an O on the front and an O on the back. Per­haps he in­vents mean­ing to at­tach to this fact, but the fact was first dis­cov­ered.

Or is that what you were say­ing: that we dis­cover math­e­mat­i­cal facts, but in fact what we call “math­e­mat­ics” is the in­ven­tion of mean­ing that we at­tach to cer­tain fa­vored facts?

• “So… Should it be ob­vi­ous that the nth differ­ence is n fac­to­rial? I’m afraid I don’t have in­tu­itive knowl­edge of all the prop­er­ties of the fac­to­rial func­tion.”

I don’t re­mem­ber writ­ing this. Did some­body else write this us­ing the same han­dle as me or did I write this and for­get it?

• “I don’t re­mem­ber writ­ing this. Did some­body else write this us­ing the same han­dle as me or did I write this and for­get it?”

My son’s name is also Bort.

• Chris and Ben, we cre­ate ax­iom sys­tems and we dis­cover parts of “math­e­mat­ics”. There are prob­a­bly only a finite num­ber of the­o­rems that can be stated with only 10 char­ac­ters, or 20 char­ac­ters, or 30 char­ac­ters, pro­vided we don’t add new defi­ni­tions. But the num­ber of pos­si­ble the­o­rems quickly gets very very large. Will each in­de­pen­dent group of math­e­mat­i­ci­ans come up with the same the­o­rems? Prob­a­bly not. So we get differ­ent math­e­mat­ics.

How different could alien mathematics be? I don’t know. We could look at a variety of alien mathematics and see. Except, we don’t have much of that. We presumably had different mathematical traditions in China, India, and Europe, and we got some minor differences. But they were solving similar real-life problems and they could have been in communication. If you want trade in silk then you need a lot of it for it to make much difference. A very few mathematicians traveling could spread ideas easily.

It’s easy to see al­ter­nate physics, and al­ter­nate tech­nol­ogy is kind of ar­bi­trary. I think alien math might be pretty differ­ent de­pend­ing on which the­o­rems they proved first. But I don’t have good ex­am­ples to demon­strate it.

The Pen­ny­ol­o­gist who no­tices the O’s is not that differ­ent from the Pen­ny­ol­o­gist who no­tices there’s one L on each side. A par­tic­u­lar Pen­ny­ol­ogy might no­tice one of those, or the other one, or both. Out of the many re­la­tion­ships you could pick out from the penny, which ones will peo­ple pay at­ten­tion to?

• Beau­tiful post, thanks :-)

• The differences method is of course strongly related to differential calculus. The first difference corresponds to the first derivative, etc. And just like in differences, the nth derivative of an x^n term is sure enough a constant.

• Eliezer sometimes asks a question that I would now like to ask him: what would the world look like if mathematics didn’t precede mathematicians? And if it did?