Formalized math: dream vs reality

(Disclaimer: this post is intended as an enthusiastic introduction to a topic I knew absolutely nothing about till yesterday. Correct me harshly as needed.)

We programmers like to instantly pigeonhole problems as “trivial” or “impossible”. The AI-Box has been proposed as a candidate for the simplest “impossible” problem. Today I’d like to talk about a genuinely hard problem that many people quickly file as “trivial”: formalizing mathematics. Not as in, teach the computer to devise and prove novel conjectures. More like, type in a proof of some simple mathematical fact, e.g. the irrationality of the square root of 2, and get the computer to verify it.

Now, if you’re unfamiliar with the entire field, what’s your best guess as to the length of the proof? I’ll grant you the generous assumption that the verifier already has an extensive library of axioms, lemmas and theorems in elementary math, just not this particular fact. Can you do it in one line? Two lines? (Here’s a one-liner for humans: when p/q is in lowest terms, p²/q² is also in lowest terms and hence cannot reduce to 2.)
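To make the task concrete, here is roughly what the bare problem statement looks like in a modern checker. This is only a sketch in Lean 4 with Mathlib (a system of my own choosing, not necessarily one you’d pick); `Irrational` and `Real.sqrt` come from its standard library, and the part whose length you’re estimating is the proof body, left as a hole here:

```lean
import Mathlib

-- The goal, stated for the machine. Everything after `:=` is the proof the
-- verifier must check; `sorry` is Lean's placeholder for "not written yet".
theorem sqrt_two_irrational : Irrational (Real.sqrt 2) := by
  sorry
```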

While knowledgeable readers chuckle to themselves and the rest frantically calibrate, I will take a moment to talk about the dream...

Since the cutting edge of math will obviously always outrun any codification effort, why formalize proofs at all? Robert S. Boyer (of Boyer-Moore fame) spelled out his vision of the benefits in the QED Manifesto in 1993: briefly, a machine-checkable repository of humanity’s mathematical knowledge could help math progress in all sorts of interesting ways, from educating kids to publishing novel results. It would be like a good twin of Cyc where the promises actually make sense.

But there could be more to it than that. If you’re a programmer like me, what word first comes to your mind upon imagining such a repository? Duh, “refactoring”. Our industry has a long and successful history of using tools for automated refactoring: mechanically identifying opportunities to extract common code, then shoving it around with guaranteed preservation of program behavior. Judging from our experience shortening large ad-hoc codebases, this could mean a radical simplification of all areas of math at once!

Import Group Theory. Analyze. Duplicate Code Spotted: 119 Instances. Extract? Yes or No.

...Or so goes the dream.

You ready with that estimate? Here’s a page comparing formal proofs that sqrt(2) is irrational in 17 different proof checkers. The typical short proof (e.g. Mizar’s) runs about a page, but a lot of its complexity is hidden in innocent-looking “by” clauses or “tactics” that are, in effect, calls to a sophisticated formula-matching algorithm (“prove this from that”), so you can’t exactly verify this kind of proof by hand the way the software does. The more explicit proofs are many times longer. And yes, all of them were allowed to use the verifiers’ standard libraries, which weigh in at several megabytes in some cases.
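For contrast, here is what that hole from earlier can look like today in Lean 4 with Mathlib (a hedged sketch; the lemma names `Nat.Prime.irrational_sqrt` and `Nat.prime_two` are as I recall them from recent Mathlib and may have moved). The proof is one line on the surface precisely because the `simpa` tactic and the library lemma do the heavy lifting, the same kind of hidden machinery as those “by” clauses:

```lean
import Mathlib

-- One line on the surface: `Nat.Prime.irrational_sqrt` already knows the square
-- root of any prime is irrational, and `simpa` bridges the cast from the natural
-- number 2 to the real number 2. The real work is buried in the library.
example : Irrational (Real.sqrt 2) := by
  simpa using Nat.prime_two.irrational_sqrt
```

And yes, that library lemma is very nearly the fact itself, which is exactly the point about standard libraries weighing megabytes.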

Things aren’t so gloomy, though. We haven’t won, but we’re making definite progress. We don’t have Atiyah-Singer yet, but the prime number theorem is done and so is the Jordan curve theorem, so the world’s formalized math knowledge already contains proofs I can’t recite offhand. Still, the situation looks as if a few wannabe AI researchers could substantially help humanity by putting their minds to the proof-encoding problem instead; it’s obviously much easier than general AI, but lies in the same rough direction.

In conclusion, ponder this: human beings have no evolutionary reason to be good at math. (As opposed to, say, hunting or deceiving each other.) Chances are that, on an absolute scale, we suck about as hard as it’s possible to suck and still be called “mathematicians”: what do you mean, you can’t do 32-bit multiplication in your head? It’s still quite conceivable that a relatively dumb prune-search program, not your ominous paperclipper, could beat humans at math as decisively as computers already beat us at chess.