A hot math take
As I learn mathematics I try to deeply question everything, and pay attention to which assumptions are really necessary for the results that we care about. Over time I have accumulated a bunch of “hot takes” or opinions about how conventional math should be done differently. I essentially never have time to fully work out whether these takes end up with consistent alternative theories, but I keep them around.
In this quick-takes post, I’m just going to really quickly write out my thoughts about one of these hot takes. That’s because I’m doing Inkhaven and am very tired and wish to go to sleep. Please point out all of my mistakes politely.
The classic methods of defining numbers (naturals, integers, rationals, algebraic, reals, complex) are “wrong” in the sense that they don’t match how people actually think about numbers (correctly) in their heads. That is to say, they don’t match the epistemically most natural conceptualization of them: the one that carves nature at its joints.
For example, addition and multiplication are not two equally basic operations that just so happen to be related through the distributivity property, forming a ring. Instead, multiplication is repeated addition. It’s a theorem that repeated addition is commutative. Similarly, exponentiation is repeated multiplication. You can keep defining repeated operations, resulting in the hyperoperations. I think this is natural, but I’ve never taken a math class or read a textbook that talked about the hyperoperators. (If they do, it’s via the much less natural version that is the Ackermann function.)
This actually goes backwards one more step; addition is repeated “add 1”. Associativity is an axiom, and commutativity of addition is a theorem. You start with 1 as the only number. Zero is not a natural number, and comes from the next step.
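To make the “repeated operation” picture concrete, here is a minimal Python sketch (my own illustration, with hypothetical names `succ` and `hyper`) that builds addition as repeated “add 1”, multiplication as repeated addition, and so on up the ladder:

```python
def succ(n):
    """The only primitive operation: add 1."""
    return n + 1

def hyper(k, a, b):
    """k-th hyperoperation on positive integers:
    k=1 addition (repeated succ), k=2 multiplication (repeated addition),
    k=3 exponentiation (repeated multiplication), k=4 tetration, ...
    """
    if k == 1:
        result = a
        for _ in range(b):          # addition is b repetitions of "add 1"
            result = succ(result)
        return result
    result = a                      # for k >= 2: fold the (k-1)-op b-1 times
    for _ in range(b - 1):
        result = hyper(k - 1, a, result)
    return result

assert hyper(1, 2, 3) == 5          # 2 + 3
assert hyper(2, 2, 3) == 6          # 2 * 3
assert hyper(3, 2, 3) == 8          # 2 ^ 3
assert hyper(4, 2, 3) == 16         # 2 ^^ 3 = 2 ^ (2 ^ 2)
```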
The negative numbers are not the “additive inverse”. You get the negatives (epistemically) by deciding you want to work with solutions to all equations of the form a+x=b for naturals a and b. The fact that declaring these objects to exist is consistent should be a theorem, as should the fact that some of the solutions to different equations are equal (e.g. that the solution to 2+x=1 is the same as the solution to 3+x=2).
This idea is also iterated up through the hyperoperations. The rational numbers are (again, epistemically) the set of all solutions to the equations a∗x=b where a and b are integers. The fact that demanding these solutions is not consistent when a is 0 should also be a theorem.
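A minimal sketch of this “formal solutions to equations” move (my framing, using the standard trick of representing the solution of a+x=b by the parameter pair (a, b); the helper names are hypothetical):

```python
# "The solution x of a + x = b" is represented by the pair (a, b) of naturals.
def same_integer(p, q):
    """(a, b) and (c, d) name the same solution iff a + d == c + b."""
    (a, b), (c, d) = p, q
    return a + d == c + b

assert same_integer((2, 1), (3, 2))      # solution of 2+x=1 equals that of 3+x=2
assert not same_integer((2, 1), (1, 2))  # "-1" is not "+1"

# One level up: "the solution x of a * x = b", with a, b integers and a != 0.
def same_rational(p, q):
    """(a, b) and (c, d) name the same solution iff a * d == c * b."""
    (a, b), (c, d) = p, q
    assert a != 0 and c != 0, "a = 0 is exactly the inconsistent case"
    return a * d == c * b

assert same_rational((2, 1), (4, 2))     # 1/2 equals 2/4
assert not same_rational((2, 1), (3, 2))
```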
Since the third-degree hyperoperation a^b is not commutative, you can demand two new types of solutions, those of x^a=b and those of a^x=b. This gives you roots and logs. The fact that roots require you to define the imaginary numbers, and therefore lose the total ordering over the numbers, should be a theorem.
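As a tiny illustration (mine, not the post’s) of why demanding root-solutions forces the number system to grow: x^2 = -1 has no real solution, but the complex numbers supply one, while logs of positive reals stay real.

```python
import math
import cmath

# "Roots": solutions of x^a = b.  For a = 2, b = -1 there is no real solution
# (math.sqrt raises), but the complex numbers provide one.
try:
    math.sqrt(-1)
except ValueError:
    pass                      # no real square root of -1
i = cmath.sqrt(-1)            # 1j
assert i * i == -1

# "Logs": solutions of a^x = b.  For positive real a and b, this stays real.
assert math.log2(8) == 3.0
```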
Some of the solutions to these equations are numbers we already had, and some of them are new numbers that we didn’t have. This setup leads to the natural question of whether we will keep on producing new types of numbers or not. The complex numbers are special in part because they are closed under all these inverse operations. But do we need new types of numbers if we demand solutions to the fourth-order hyperoperator? What happens if we go all the way up? I have no idea.
Peano Arithmetic and ZFC pretty much do define addition and multiplication recursively in terms of successor and addition, respectively.
My guess would be that we actually want to view there as being multiple basic/intuitive cognitive starting points, and they’d correspond to different formal models. As an example, consider steps / walking. It’s pretty intuitive that if you’re on a straight path, facing in one fixed direction, there’s two types of actions—walk forward a step, walk backward a step—and that these cancel out. This corresponds to addition and subtraction, or addition of positive numbers and addition of negative numbers. In this case, I would say that it’s a bit closer to the intuitive picture if we say that “take 3 steps backward” is an action, and doing actions one after the other is addition, and so that action would be the object “-3”; and then you get the integers. I think there just are multiple overlapping ways to think of this, including multiple basic intuitive ones. This is a strange phenomenon, one which Sam has pointed out. I would say it’s kinda similar to how sometimes you can refactor a codebase infinitely, or rather, there’s several different systemic ways to factor it, and they are each individually coherent and useful for some niche, but there’s not necessarily a clear way to just get one system that has all the goodnesses of all of them and is also a single coherent system. (Or maybe there is, IDK. Or maybe there’s some elegant way to have it all.)
Another example might be “addition as combining two continuous quantities” (e.g. adding some liquid to some other liquid, or concatenating two lengths). In this case, the unit is NOT basic, and the basic intuition is of pure quantity; so we really start with R.
Were you (or others here) not introduced to multiplication as repeated addition and exponentiation as repeated multiplication? How was it introduced to you? I don’t remember if I was taught this in school, but I viewed the commutativity of addition/multiplication geometrically: addition through the lens of stacking “sticks” of different lengths together and multiplication as area.
When I was in middle school I was also obsessed with higher operations and began to accelerate my own math journey intending to conduct research in that field. I was also surprised to see so little work done there. Turns out it’s just an ugly area of math (compared to others) and I stopped really thinking about it. But I don’t regret the time I spent discovering “theorems” and whatever and encourage you to do the same. I’ll bet in time you’ll reverse your opinions here, but who knows.
For your last paragraph: consider looking into how one might even define tetration at fractional hyper-powers. That’s the “easiest” case but it’s already non-trivial!
Clearly OP was introduced to addition and multiplication as the coproduct and product in the category of sets.
The usual definition of numbers in lambda calculus is closer to what you want; numbers are iterators, which, given a zero z and a function f, iterate f some number of times. I played around with defining numbers in lambda calculus plus an iota operator. (iota [p]) returns a term t such that (p t) = true, if such a term exists. (“true” is just λxy.x) This allows us to define negative numbers as the things that inverse-iterate, define imaginary numbers, etc., all in one simple formalism.
iota plus lambda allows for higher-order logic already (∃x.p(x) is just p(ιp); ∃p.p(x) is just [λp.(p x)](ι λp.(p x))), so there are definitely questions of consistency. However, it feels like some suitable version of this could be a really pleasing foundation close to what you had in mind.
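For concreteness, here is a small Python rendering of the “numbers are iterators” (Church numeral) idea from the parent comment; the iota/choice operator is not computable in general, so this sketch (with hypothetical helper names) covers only the iterator part:

```python
# Church-style numerals: the number n is the function that iterates f n times.
def church(n):
    """Build the iterator for the ordinary integer n (n >= 0)."""
    return lambda f: lambda z: z if n == 0 else f(church(n - 1)(f)(z))

def add(m, n):
    """m + n: iterate f n times, then m more times."""
    return lambda f: lambda z: m(f)(n(f)(z))

def mul(m, n):
    """m * n: iterate "iterate f n times" m times."""
    return lambda f: lambda z: m(n(f))(z)

def to_int(n):
    """Read a numeral back off by iterating "add 1" starting from 0."""
    return n(lambda k: k + 1)(0)

two, three = church(2), church(3)
assert to_int(add(two, three)) == 5
assert to_int(mul(two, three)) == 6
```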
The distributivity property is closely related to multiplication being repeated addition. If you break one of the numbers apart into a sum of 1s and then distribute over the sum, you get repeated addition.
It sounds like you might be looking for Peano’s axioms for arithmetic (which essentially formalize addition as being repeated “add 1” and multiplication as being repeated addition) or perhaps explicit constructions of various number systems (like those described here).
The drawback of these definitions is that they don’t properly situate these number systems as “core” examples of rings. For example, one way to define the integers is to first define a ring and then define the integers to be the “smallest” or “simplest” ring (formally: the initial object in the category of rings). From this, you can deduce that all integers can be formed by repeatedly summing 1s or −1s (else you could make a smaller ring by getting rid of the elements that aren’t sums of 1s and −1s) and that multiplication is repeated addition (because a⋅b=a⋅(1+⋯+1)=a+⋯+a where there are b terms in these sums).
(It’s worth noting that it’s not the case in all rings that multiplication, addition, and “plus 1” are related in these ways. E.g. it would be rough to argue that if A and B are matrices then the product AB corresponds to summing A with itself B times. So I think it’s a reasonable perspective that multiplication and addition are independent “in general” but the simplicity of the integers forces them to be intertwined).
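A rough sketch of the initiality point (my illustration, with a deliberately minimal notion of “ring element”): the canonical map from the integers into any ring with identity is exactly “sum up copies of 1 or −1”, which is why multiplication-as-repeated-addition is forced inside Z even though it fails for general matrices.

```python
import numpy as np

def from_int(n, one):
    """The image of the integer n in a ring with identity `one`:
    a sum of |n| copies of `one` (or of -one when n is negative)."""
    result = one - one                       # the ring's zero
    for _ in range(abs(n)):
        result = result + (one if n >= 0 else -one)
    return result

I = np.eye(2)                                # identity of the 2x2 matrix ring
assert np.array_equal(from_int(3, I), 3 * I)
assert np.array_equal(from_int(-2, I), -2 * I)
# Initiality says this is the unique ring homomorphism Z -> R; inside Z itself
# it forces "multiplication is repeated addition", even though the product of
# two general matrices is not repeated addition of either factor.
```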
Some other notes:
Defining −a to be the additive inverse of a is the same as defining it as the solution to a+x=0. No matter which approach you take, you need to prove the same theorems to show that the notion makes sense (e.g. you need to prove that −a+−b=−(a+b)).
Similarly, taking Q to be the field of fractions of Z is equivalent to insisting that all equations ax=b (with a≠0) have a solution, and the set of theorems you need to prove to make sure this is reasonable is the same.
In general, note that giving a definition doesn’t mean that there’s actually any object that satisfies that definition. E.g. I can perfectly well define α to be an integer such that α⋅0=3, but I would still need to prove that there exists such an integer α. No matter how you define the integers, rational numbers, etc., you need to prove that there exists a set and some operations that satisfy that definition. Proving this typically requires giving a construction along the lines of what you seem to be looking for. So these definitions aren’t really meant to be a substitute for constructing models of the number systems.
You may be interested in this article and its successors, which look at a specific type of commutative hyperoperator.
This makes me think of Eurisko/Automated Mathematician, and wonder what minimal set of heuristics and concepts you can start with to get to higher math.
I was expecting a take on a particular tune by Andrew Bird.
I disagree that maths “should be” done differently. I have a strong feeling that the way stuff is usually defined nowadays has the property of being maximally easy to use. We don’t really need the definitions to look exactly like the intuition we had to invent them, as long as the resulting objects behave exactly the same, and the less intuitive definitions are easier to use in proofs. For example, defining all powers directly via the Taylor series of e^x makes defining complex and matrix exponentials much easier / possible at all, and an ad hoc proof that this coincides with the naive version is simple. It also simplifies checking well-definedness a lot. Many more such examples.
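As a hedged illustration of the commenter’s e^x point (my example, not theirs): once exp is defined by its power series, the very same definition applies verbatim to matrices; below is a naive truncated-series sketch, checked against scipy.linalg.expm.

```python
import numpy as np
from scipy.linalg import expm

def exp_series(A, terms=30):
    """Naive matrix exponential: sum of A^k / k! for k = 0 .. terms-1.
    Fine for small, well-scaled matrices; real implementations do more."""
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k                  # builds A^k / k! incrementally
        result = result + term
    return result

A = np.array([[0.0, 1.0],
              [-1.0, 0.0]])                  # generator of 2D rotations
assert np.allclose(exp_series(A), expm(A))   # both give the rotation by 1 radian
```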