This post reminds me of non-standard analysis.

The story goes like this: in the beginning, Leibniz and Newton developed calculus using infinitesimals, which were intuitive but had no rigorous foundation (which is to say, ad-hoc). Then ϵ−δ calculus was developed, which used limits instead and did have a rigorous foundation. Then, despite much wailing and gnashing of teeth from students ever after, infinitesimals were abandoned for roughly a century. Finally, in the 1960s, Abraham Robinson provided a rigorous formalism for infinitesimals, which gave birth to non-standard analysis (using infinitesimals), to be contrasted with standard analysis (using ϵ−δ).
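To make the infinitesimal style concrete: Robinson's hyperreals are far too heavy to sketch in a few lines, but dual numbers (a value x plus a formal infinitesimal part with ε² = 0, the idea behind forward-mode automatic differentiation) capture the flavor of Leibniz-style computation. This is an illustrative sketch of mine, not non-standard analysis proper; in the hyperreals infinitesimals are invertible, while here ε² is simply zero:

```python
class Dual:
    """A number x + dx*eps, where eps**2 == 0: a mechanized infinitesimal."""

    def __init__(self, real, inf=0.0):
        self.real, self.inf = real, inf

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.real + other.real, self.inf + other.inf)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # (a + b*eps)(c + d*eps) = ac + (ad + bc)*eps, since eps**2 = 0
        return Dual(self.real * other.real,
                    self.real * other.inf + self.inf * other.real)

    __rmul__ = __mul__


def derivative(f, x):
    """Evaluate f at x + eps and read off the coefficient of eps."""
    return f(Dual(x, 1.0)).inf


# d/dx of x**3 at x = 2 is 3*x**2 = 12
print(derivative(lambda x: x * x * x, 2.0))  # 12.0
```

Computing f(x + ε) and reading off the ε coefficient is exactly the "drop the higher-order infinitesimals" move in Leibniz's d(x²) = 2x·dx + dx² ≈ 2x·dx.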
So now there is a continuous, low-grade background fight going on: people do work in non-standard analysis but then have to convert it into standard analysis to get published, and the non-standard camp says their approach is intuitive, while the standard camp says theirs is formally just as powerful and everyone already knows it, so doing it any other way is stupid.
The way this relates to the post is through claims like the following, about why non-standard analysis can generate proofs that standard analysis (probably) can’t:
The reason for this confusion is that the set of principles which are accepted by current mathematics, namely ZFC, is much stronger than the set of principles which are actually used in mathematical practice. It has been observed (see [F] and [S]) that almost all results in classical mathematics use methods available in second order arithmetic with appropriate comprehension and choice axiom schemes. This suggests that mathematical practice usually takes place in a conservative extension of some system of second order arithmetic, and that it is difficult to use the higher levels of sets. In this paper we shall consider systems of nonstandard analysis consisting of second order nonstandard arithmetic with saturation principles (which are frequently used in practice in nonstandard arguments). We shall prove that nonstandard analysis (i.e. second order nonstandard arithmetic) with the -saturation axiom scheme has the same strength as third order arithmetic. This shows that in principle there are theorems which can be proved with nonstandard analysis but cannot be proved by the usual standard methods. The problem of finding a specific and mathematically natural example of such a theorem remains open. However, there are several results, particularly in probability theory, whose only known proofs are nonstandard arguments which depend on saturation principles; see, for example, the monograph [Ke]. Experience suggests that it is easier to work with nonstandard objects at a lower level than with sets at a higher level. This underlies the success of nonstandard methods in discovering new results. To sum up, nonstandard analysis still takes place within ZFC, but in practice it uses a larger portion of full ZFC than is used in standard mathematical proofs.
Emphasis mine. This is from On the Strength of Nonstandard Analysis, a 1986 paper by C. Ward Henson and H. Jerome Keisler. I found the paper and this part of the quote via a StackOverflow answer. Note: you will probably have to use Sci-Hub unless you have Cambridge access; the find-free-papers browser tools seem to mismatch it with a later paper by Keisler that has an almost identical title.
I now treat this as a pretty solid heuristic: when it comes to methods or models, when people say one is intuitive, they mean that it chunks at least some of the work at a lower level of abstraction. Another math case with a similar flavor of claim is Hestenes’ Geometric Algebra, which mostly achieves this by putting geometric structure at the foundation, allowing humans to use their pretty-good geometric intuition throughout. This pays off by tackling some questions previously reserved for QM with classical methods, among other neat tricks.
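For a taste of why Geometric Algebra proponents make the "intuitive" claim, here is a minimal toy sketch of my own (not Hestenes’ notation) of the geometric product in 2D. Multiplying two vectors yields a scalar part (the dot product) plus a bivector part (the oriented area they span), and rotations fall out of the same product:

```python
import math

def gp(a, b):
    """Geometric product of 2D multivectors, stored as tuples
    (scalar, e1, e2, e12), with e1*e1 = e2*e2 = 1 and e1*e2 = -e2*e1 = e12."""
    a0, a1, a2, a12 = a
    b0, b1, b2, b12 = b
    return (
        a0*b0 + a1*b1 + a2*b2 - a12*b12,   # scalar part (includes dot product)
        a0*b1 + a1*b0 - a2*b12 + a12*b2,   # e1 part
        a0*b2 + a2*b0 + a1*b12 - a12*b1,   # e2 part
        a0*b12 + a12*b0 + a1*b2 - a2*b1,   # e12 part (oriented area)
    )

e1 = (0.0, 1.0, 0.0, 0.0)
e2 = (0.0, 0.0, 1.0, 0.0)

# The product of two orthogonal unit vectors is a pure unit bivector:
print(gp(e1, e2))  # (0.0, 0.0, 0.0, 1.0)

# Multiplying a vector by exp(e12*theta) = cos(theta) + sin(theta)*e12
# rotates it by theta, just as multiplying by e^{i*theta} does for
# complex numbers -- the "i" here is the oriented unit area e12:
theta = math.pi / 2
rot = (math.cos(theta), 0.0, 0.0, math.sin(theta))
print(gp(e1, rot))  # approximately (0, 0, 1, 0): e1 rotated onto e2
```

The point of the example is where the structure lives: the imaginary-unit role is played by a geometric object (the oriented plane e12), so the algebra itself carries the geometric picture rather than bolting it on afterward.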
For the record I do not know how to do non-standard analysis even a little; I only ever knew what it was because it gets a footnote in electrical engineering as “that thing that let us figure out how to convert between continuous and discrete time.”
This is a great comment.

I disagree with this particular line, though I don’t think it messes up your general point here (if anything it strengthens it):
The story goes like this: in the beginning, Leibniz and Newton developed calculus using infinitesimals, which were intuitive but had no rigorous foundation (which is to say, ad-hoc).
Part of the point of the post is that ad-hoc-ness is not actually about the presence or absence of rigorous mathematical foundations; it’s about how well the mathematical formulas we’re using match our intuitive concepts. It’s the correspondence to intuitive concepts which tells us how much we should expect the math to generalize to new cases which our intuition says the concept should generalize to. The “arguments” which we want to uniquely specify our formulas are not derivations or proofs from ZFC, they’re intuitive justifications for why we’re choosing these particular definitions.
So I’d actually say that infinitesimals were less ad-hoc, at least at first, than epsilon-delta calculus.
This also highlights an interesting point: ad-hoc-ness and rigorous proofs are orthogonal. It’s possible to have the right formulas for our intuitive concepts before we know exactly what rules and proofs will make it fully rigorous.
Highlighting the difference between ad-hoc-ness and rigor was what I was trying to do when I emphasized that element, though I should have put the parenthetical in the section contrasting intuition and rigor instead. The implicit assumption I made, which I should probably make explicit, is that if we have something which matches our intuitive concepts well and also has a rigorous foundation, then I expect it to dominate the other options (both in terms of effectiveness and popularity).
Fleshing out the assumption a bit: if you made a 2x2 graph with intuition match (the opposite of ad-hoc-ness) as the x axis and rigor as the y axis, the upper right quadrant is the good stuff we use all the time; the upper left quadrant is true-but-useless; the bottom left quadrant is ignored completely; and the bottom right quadrant, good intuition match but low rigor, is where all the action is (in the sense of definitions that might be really useful and adopted in the future).
The infinitesimals vs. limits case seems like an example: good intuition match and poor rigor was replaced with acceptable intuition match and good rigor. However, it is a bit messy; I’m zeroing in on infinitesimals vs. limits as methods, rather than as definitions per se or something like the presentation of the fundamental theorem of calculus.
I quite separately took the liberty of assuming the same logic you are applying to definitions could be applied to the rest of mathematical architecture, like methods, algorithms, notation, and so on. I admit this introduces quite a bit of fuzz.