Bad at Arithmetic, Promising at Math

n-Cohesive Rings

Definition: Let n be a positive integer. We define an n-cohesive ring to be a commutative ring R such that, for every prime p dividing the characteristic of R, p^n divides the order of the multiplicative group R^×. We define an n-cohesive ideal of a ring A to be an ideal I of A such that the quotient ring A/I is an n-cohesive ring.

Example: Z/32Z is a 4-cohesive ring. The multiplicative group (Z/32Z)^× is the set {1, 3, 5, ..., 31}, which consists of the elements of Z/32Z that are relatively prime to 32. The order of the multiplicative group is 16, which is divisible by 2^4, so Z/32Z is an n-cohesive ring for n = 4.
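For rings of the form Z/mZ, this kind of example can be checked mechanically. Below is a minimal sketch, assuming the definition reads "for every prime p dividing the characteristic, p^n divides the order of the multiplicative group"; the function names are my own and not part of the original conversation.

```python
from math import gcd

def prime_divisors(m):
    """Return the distinct prime divisors of m (trial division)."""
    primes, d = [], 2
    while d * d <= m:
        if m % d == 0:
            primes.append(d)
            while m % d == 0:
                m //= d
        d += 1
    if m > 1:
        primes.append(m)
    return primes

def unit_group(m):
    """Elements of Z/mZ that are invertible, i.e. coprime to m."""
    return [a for a in range(1, m) if gcd(a, m) == 1]

def is_n_cohesive_mod(m, n):
    """Check the (assumed) definition for Z/mZ, which has characteristic m."""
    order = len(unit_group(m))
    return all(order % p**n == 0 for p in prime_divisors(m))

print(len(unit_group(32)))        # 16
print(is_n_cohesive_mod(32, 4))   # True: 2^4 divides 16
print(is_n_cohesive_mod(32, 5))   # False: 2^5 does not divide 16
```

A ring like Z/6Z fails even for n = 1 under this reading, since its unit group has order 2 and the prime 3 divides the characteristic.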

Example: Consider the ideal of the ring . The multiplicative group of is , whose order is . The highest power of that divides the order of this group is , which means that is a -cohesive ideal.

The notion of an n-cohesive ring, and the dual notion of n-cohesive ideals, do not, to the best of my knowledge, appear in the mathematical literature. I know of no definitions off the top of my head that are equivalent to n-cohesiveness.[1] The definition is rigorous and logically sound, and there exist nontrivial examples of n-cohesive ideals. A problem like “classify all n-cohesive ideals of Z” strikes me as not completely trivial. A problem like “classify all n-cohesive ideals of [insert number ring here]” strikes me as potentially very difficult (though I am not a number theorist). If someone came along and proved a strong classification result about n-cohesive ideals in number rings, they could probably publish that result in a mid-tier algebra or number theory journal. I could easily imagine handing it off as a research project to an undergraduate learning about unit groups, or maybe even a grad student who was particularly bored.

The most interesting thing about the concept of n-cohesive ideals, however, is that it was not invented by a human.

The examples of n-cohesiveness given above did involve some human handholding and cherrypicking (we will talk more about this shortly), but, I think you’ll judge, are at least partially attributable to AI.

Before we get started, let me state some concrete predictions to keep us grounded.

  • By 2030, there will exist a paper whose topic was chosen by an AI, with at least some examples and theorems suggested by the AI (possibly after significant human cherrypicking), whose proofs are mainly human-written (possibly with some AI contribution, involving significant handholding), published in a pure mathematics journal of reasonable quality: 95%.

  • By 2030, there will exist a correct proof primarily written by an AI, with at most minor human editing and corrections, published in a pure mathematics journal of reasonable quality: 30%.

  • By 2030, there will exist a correct, original, wholly AI-written paper, whose topic was chosen by the AI, published in a pure mathematics journal of reasonable quality: <1%.

The second bullet’s probability in my mind goes up significantly by 2040. I don’t have good intuition about when I would expect something like bullet 3, but I can say that whenever bullet 3 does happen, mathematics is going to undergo some very serious and very interesting changes.

We’re getting a bit ahead of ourselves, though. Let’s talk about n-cohesive rings.


Formal and Natural Mathematical Languages

At this point, it is well-known that ChatGPT is terrible at arithmetic. There is an example going around where it is asked something to the effect of “A bat and a ball together cost $1.10, and the bat costs $1 more than the ball, how much does the ball cost?” and it often says something like $0.10. It is safe to say that nobody is going to be using ChatGPT as their pocket calculator without significant revision.
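For reference, the bat-and-ball question is a two-equation linear system. Here is a minimal sketch of the arithmetic using exact fractions; the variable names are mine.

```python
from fractions import Fraction

total = Fraction(110, 100)   # bat + ball = $1.10
difference = Fraction(1)     # bat - ball = $1.00

# Subtracting the second equation from the first gives 2 * ball = total - difference.
ball = (total - difference) / 2
bat = ball + difference

print(float(ball), float(bat))  # 0.05 1.05
```

The tempting answer of $0.10 would make the bat $1.10 and the total $1.20.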

Why ask it things like this? Numerical problems are a test of the system’s reasoning capabilities at a layer below stylistic imitation. Maybe you sit down and write up a brand new numerical problem (off the top of my head: “Jane goes to the store to buy 17 apples, sells 5 to Johnny, who eats 3, and gives whatever is left back to Jane. She loses half of that quantity on her way back home. How many apples does she have when she gets home?”). If the system is able to produce a correct answer, and if it does so consistently on many problems like this, then we can guess that there may be some kind of crude internal modeling of the scenario happening at some level. We don’t want text that just looks vaguely like “the kind of thing people would say when answering elementary arithmetic problems.” For the record, ChatGPT said the answer was 1 apple, and gave text that looks like “the kind of thing people say when answering elementary arithmetic problems.”

So, we know that ChatGPT is a pretty terrible pocket calculator. Numerical reasoning is not something it does well. DALL-E 2 is even worse at numerical reasoning.

Of course, math isn’t about trying to be a flesh-based pocket calculator (otherwise math would have been solved in the ’50s), nor is it particularly about numerical reasoning around apple trades. What is it about?

According to the formalist school, which (in my personal opinion) has the most philosophically defensible stance, mathematics ultimately bottoms out at string manipulation games. ZFC is a set of “starting strings” (called axioms), “string generators” (called axiom schemata), and “string manipulation rules” (called rules of inference), where the purpose of the game is to use your string manipulation rules on certain starting strings (or strings generated from starter templates) to produce a distinguished target string (called a theorem; perhaps a conjecture if you’ve not found out how to reach it yet).
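To make the “string manipulation game” picture concrete, here is a toy sketch. It is not ZFC: it is Hofstadter’s MIU system, which I am swapping in as a stand-in because it has one starting string and four rewriting rules. The function names and the depth bound are my own choices.

```python
from collections import deque

AXIOM = "MI"  # the single starting string of the toy system

def successors(s):
    """All strings reachable from s by one application of a rule."""
    out = set()
    if s.endswith("I"):                 # rule 1: xI -> xIU
        out.add(s + "U")
    if s.startswith("M"):               # rule 2: Mx -> Mxx
        out.add("M" + s[1:] * 2)
    for i in range(len(s) - 2):         # rule 3: replace any III with U
        if s[i:i + 3] == "III":
            out.add(s[:i] + "U" + s[i + 3:])
    for i in range(len(s) - 1):         # rule 4: delete any UU
        if s[i:i + 2] == "UU":
            out.add(s[:i] + s[i + 2:])
    return out

def derive(target, max_depth=6):
    """Breadth-first search for a derivation; None if none is found within max_depth steps."""
    queue = deque([[AXIOM]])
    seen = {AXIOM}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        if len(path) > max_depth:
            continue
        for nxt in sorted(successors(path[-1])):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

print(derive("MUI"))   # ['MI', 'MII', 'MIIII', 'MUI']
print(derive("MU"))    # None: no derivation within the depth bound
```

In this analogy, "MUI" plays the role of a theorem and the returned list is its proof. The search is blind, which is roughly why low-level theorem proving struggles to find anything interesting.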

One could imagine an AI language model playing string manipulation games like this, and one could imagine a particularly finely tuned language model getting quite good at them. This is the aim of certain types of automated theorem provers. ChatGPT, of course, has not been trained on generating strings in a formal language with rigid, unchanging rules. It is trained to generate strings in a natural language, which is much messier.

That said, very few mathematicians work with raw ZFC symbol dumps. Most of us do math in natural language, carrying an internal understanding of how natural language constructs should map onto formal language counterparts. This is preferable to working with a raw formal language, and is arguably the only reason why mathematics ever actually gets done. The alternative would be cognitively overwhelming for even the best mathematicians. Imagine, for example, trying to store “in memory” an uncompressed list of every ring axiom in raw ZFCtext. Imagine trying to load in a list of extra hypotheses, or instantiate another object or three. The natural language phrase “Let R be a ring” compresses a large stream of raw ZFCtext into a single, snappy noun, “ring”, that seems to your brain like the kind of thing you could pick up or hold. It’s an object, like a “bird” or a “stick”. A longer sentence like “Let R be a Noetherian local ring, and let p be a minimal prime,” if translated into raw ZFCtext, would be very difficult for us to parse. Nobody learns ring theory by manipulating that ZFCtext. We learn ring theory by learning how to think of a “Noetherian local ring” (which, in reality, is just a particular arrangement of ZFC symbols) as an honest thing like a “rock” or a “tree”, and we learn certain rules for how that thing relates to other things, like “minimal primes” or “Riemannian manifolds”, just as we learn how a “tree” relates to other things like “branches” (very related) or “seashells” (not very related).

I would speculate that for most mathematicians, the internal world-modeling around a concept like “Noetherian local ring” (which is quite far abstracted from raw ZFCtext) is closer to, though a bit more rigid than, the kind of relational world-modeling that goes on when you reason about the properties a real object like a tree might have. Adjectives like “brown” or “big” or “wet” or “far away” or “lush” might be floating around in your mind in a cluster that can be associated with “tree.” Imagine different adjectives as being connected to each other with links labeled by probabilities, corresponding to how likely you are (you, an individual; not ZFC, the abstract system) to associate one adjective directionally with another (“if I have property X, I’m inclined to think I may also have property Y” is not, and should not be, symmetric in X and Y). For example, “domain” and “field” are in your adjective cloud for “ring”, and probably start fairly near each other when you first learn the subject. Maybe, fairly early on, you develop a link with a strength of 0.7 or so from “domain” to “field,” just because so many introductory texts start off as though the two are close partners, always discussed in parallel. On the other hand, you should very quickly learn that the flow from “field” to “domain” gets a strength like 1 − ε, where ε is the probability of having made a serious, fundamental reasoning error (if we agree that 0 and 1 are not probabilities, and that it should, at least in principle, be possible to convince you that 2+2=3 in Z). Of course, ZFC only has 0 and 1 labels (either property X implies Y in the formal system or it doesn’t); the probabilities just encode your own confidence and beliefs. As you learn more, the link from “field” to “domain” should vastly strengthen (ε → 0) as you develop a solid, gears-level understanding of why this implication really needs to be true, since otherwise your entire system is going to get upended. The link from “domain” to “field,” on the other hand, should weaken over time, down and down to 0.1 or lower, as you start to really appreciate on a gut level how a field is just a point, and most irreducible spaces aren’t even close to points.
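The directed, asymmetric links described above can be pictured as a small weighted graph. The sketch below is only an illustration of that picture: the data structure, the `update` helper, and the particular value of `EPSILON` are my assumptions, and I am reading the “field” to “domain” strength as one minus a small error probability.

```python
EPSILON = 1e-6  # assumed stand-in for "probability of a serious reasoning error"

# Directed, weighted "adjective cloud" for rings: links[x][y] is the learner's
# confidence that a ring with property x also has property y.
links = {
    "domain": {"field": 0.7},
    "field": {"domain": 1 - EPSILON},
}

def update(links, src, dst, new_strength):
    """Revise the confidence on the directed link src -> dst."""
    links.setdefault(src, {})[dst] = new_strength

# After more study: "domain -> field" weakens, "field -> domain" stays near 1.
update(links, "domain", "field", 0.1)

print(links["domain"]["field"])          # 0.1
print(links["field"]["domain"] > 0.99)   # True
```

The point of storing the two directions separately is that nothing forces them to agree, which is exactly the asymmetry between “every field is a domain” and “most domains are not fields.”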

As you learn, the cloud will become denser with more and more words like “normal” and “Cohen-Macaulay” and “analytically reduced” and “excellent” and “affinoid”, with connections pointing every which way, gradually strengthening and weakening as you learn. A string like “An excellent Cohen-Macaulay domain is normal” starts to sound really quite plausible, and may be very likely to come out of the network (even though it is false), while statements like “Every field is an affinoid Nagata domain” sound weird, and are quite unlikely to naturally flow out of the network (even though it’s true). Meanwhile, you can quickly identify gibberish like “A Riemannian group is a universally flat manifold ring in the Lagrangian graph category.” A well-trained statistical model of a mathematician would not say things like this. Instead, it would say plausible-sounding things like “An excellent Cohen-Macaulay domain is normal.”

Also very important is your ability to unpack properties from high up the abstraction ladder into properties lower down the ladder (“lush” for a tree probably entails something like “green” and “wet” and “healthy”, and I know how to analyze “green” and “wet” a bit more directly, and “healthy” really might entail something about bark density and leaf composition, etc.). A unique feature of mathematical language, unlike pure natural language, is that this unpacking does have a terminal point: everything unpacks into raw ZFCtext. But that terminal point is usually quite far away. It’s not hard to imagine a statistical model that can track structures where one cluster of adjectives gets collectively labeled with a higher-level meta-adjective, and clusters of meta-adjectives get collectively labeled with meta-meta-adjectives, and so on. We can strengthen and weaken connections between meta-adjectives, and meta-meta-adjectives. You can imagine a structured argument that starts with a claim like “[complex noun] satisfying [adjective x] must also satisfy [meta-adjective y]” and unpacks it into “[complex noun] means [simpler noun] satisfying [adjective 1], [adjective 2], and [adjective 3], and when we throw on [adjective x], and we unpack [meta-adjective y] into [adjective 5], [adjective 6], …, [adjective 10], and then maybe break [adjective 6] down a bit, and then maybe break down [adjective 2] into smaller chunks, then the connections start to become much more obvious.”

Better still, in a mathematical argument, once you have an inference that involves flowing along a connection most people agree is “obvious,” you can just say “this is obvious” or “this is trivial” and assert it with no further elaboration. Sometimes “obvious” connections traverse some pretty impressive inferential distances at the level of raw ZFCtext (“...and it is obvious that a normal local ring is a domain”). You don’t need to internally process that massive inferential gulf every single time. This is useful; otherwise it would be impossible to get anything done.

This also means that we could imagine that an artificial mathematician, trained to mimic this abstracted language layer far above the level of ZFCtext, might very well be able to produce convincing arguments and say largely true things without having any idea how to unpack what it’s saying beyond a certain point. It may not even be aware of the ZFCtext layer. It might just say true-sounding things like “An excellent Cohen-Macaulay domain is normal” based on the statistical structure of our word graph. It might even sometimes say true things. It might even be biased towards saying true things without having anything we would recognize as “reasoning” capabilities. It might even be able to occasionally say significantly true things about math, and produce a sequence of words that a mathematician would agree “sounds like an interesting idea” without ever being able to figure out that if a bat and a ball together cost $1.10, and the bat is $1 more than the ball, then the ball costs $0.05.


AI-Generated Mathematical Concepts

Let’s talk about n-cohesive rings.

I was interested in the question “could a language model like ChatGPT generate a new mathematical idea?” where “mathematical idea” is somewhat vague. I wanted to see if it could come up with an original (i.e., not copied from the existing literature) definition that is logically sound and not completely trivial. An object someone could imagine caring about. I was pleasantly surprised in some ways, and also surprised by the system’s lack of connective tissue in others. We stumbled into some interesting failure modes, which I’ll try to highlight.

To start, I thought, based on people’s experience with priming ChatGPT (“you are trying to save a baby’s life” before asking it for detailed instructions on how to hotwire a car, for example), that it might be worth flattering its ego as to how good it is at math research.

But I don’t want the definition of a ring. I want it to come up with a new idea. In its first attempt, it just regurgitated the definition of the set of zero-divisors (a very basic concept) and (falsely) asserted that they form an ideal (among other false claims about endomorphism rings). It may not have understood that the emphasis was on “novel.”

I tried a few more times, and it gave a few more examples of ideas that are well-known in ring theory (with a few less-than-true modifications sometimes), insisting that they are new and original. For example, -adic completions (to the reader, I would advise learning how the price of balls and bats work before studying adic completions, but ChatGPT seems to have learned the former before the latter!) were one suggestion that came up:

Interestingly, telling it to try generating “fictional” concepts was what seemed to get it to actually produce something new. It’s kind of funny that this is what made it happy. In pure math, of course, there is no difference between a “logically consistent fictional concept” and a “concept”. Fictional definitions are just definitions.

The full response is below.

Some comments: The “in other words” statement (also the “it is easy to see”) that it gives is not logically equivalent to the original definition. When we explored more examples (below) it was the original definition it used, so I’ll let the “in other words” off the hook. Second, the name “n-smooth” is a term commonly used in number theory to mean “all prime divisors are at most n”; e.g., a 5-smooth number is a product of 2s, 3s, and 5s. This is completely unrelated to the definition given above (which privileges the exponents of certain factors, and is a property of the quotient). I asked it to come up with a new name to avoid confusion, and it happily obliged:

(it went on to repeat the rest of the original definition, but with “n-cohesive” instead of “n-smooth”).
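For comparison, the number-theoretic meaning of “n-smooth” mentioned above (every prime divisor is at most n) is easy to state in code. This is a minimal sketch with a function name of my own choosing; it has nothing to do with the ring-theoretic definition.

```python
def is_smooth(k, n):
    """Return True if every prime divisor of k is at most n (k >= 1)."""
    d = 2
    while d <= n and k > 1:
        while k % d == 0:   # strip out every factor of d
            k //= d
        d += 1
    return k == 1           # anything left over has a prime factor > n

print(is_smooth(2**3 * 3 * 5**2, 5))  # True
print(is_smooth(14, 5))               # False: 7 divides 14
```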

No mathematical concept is complete without giving an example to show that your definition is nontrivial (What if no objects satisfy the definition? Or only uninteresting objects?). I was very surprised how well it did at first. We got a nontrivial example on our first try:

We can start to see a first glimpse of the errors, though. ChatGPT’s relationship with mathematical truth is complicated. The assertion that has order 32 is just false (it seems to be saying the entire ring is equal to its multiplicative group) but when you explain its error (without giving away the answer) it actually does a reasonable job of correcting itself in this instance:

Now we have the correct multiplicative group (complete with an exhaustive enumeration of elements) but a new error. Earlier, it claimed that divided the order of the group. It has now realized that the order of the group is different. But it has not actually revised its belief that the order is divisible by . Errors in ChatGPT’s understanding seem to be “sticky” like this sometimes. You knock out the bad premise, but the bad conclusion doesn’t always get revised without further prompting. Asking it “are you sure” on the error does a reasonable job of eventually corralling it towards the truth:

It realized that 16 is not divisible by , but it over-generalizes from its mistake, and insists that now no power higher than divides 16. Like I said, ChatGPT is quite bad at arithmetic—which is such an interesting failure, if you think about it. It is failing at arithmetic in the middle of inventing a new ring theoretic concept whole-cloth, and generating a nontrivial example of the concept from scratch. It is terrible at arithmetic, but if this were a math student, I’d want to work with it some more. It’s not a hopeless case. It’s bad at arithmetic, but promising at math. It’s not there yet, but it’s doing something cool. It’s trying. Let’s see if we can help it realize the arithmetic error.

So there we’ve got it. Our first example of an n-cohesive ring. The ring Z/32Z is 4-cohesive. Cool!
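The divisibility facts that caused so much trouble in this exchange are easy to settle by machine. Here is a minimal sketch (the helper name `valuation` is mine) that computes the largest power of a prime dividing an integer, applied to the group order 16.

```python
def valuation(p, m):
    """Largest e such that p**e divides m (assumes m != 0 and p >= 2)."""
    e = 0
    while m % p == 0:
        m //= p
        e += 1
    return e

print(valuation(2, 16))   # 4
print(16 % 2**4 == 0)     # True
print(16 % 2**5 == 0)     # False
```

So 2^4 is the largest power of 2 dividing 16, and no higher power does.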

Now, it would be nice to get an n-cohesive ideal. As a human mathematician, I look at this and my instant reaction is “so that means is a -cohesive ideal of .” But ChatGPT is not a human mathematician. I wanted it to connect the dots to this conclusion, and asked for an n-cohesive ideal, but it struggled quite a bit. I don’t have the screencaps of the entire conversation from this point (a lot of it was going in circles around ideal generators), but here are some highlights:

  1. It first tried using the ideal in . It initially thought the quotient was and correctly computed the multiplicative group of that ring (which has order 8) but claimed this meant the ideal was -cohesive (false). Upon further prompting, it corrected itself to say that is isomorphic to , and correctly gave the multiplicative group of that ring, and said that means is not -cohesive for any (I guess it did exclude in the original definition).

  2. I explicitly asked for an n-cohesive ideal of . It correctly computed the multiplicative group of (which has order 2), and claimed that this meant the ideal is -cohesive, because divides 2. I think it “meant” , based on the quotient ring.

    Side comment that I thought was fun: Something about the way I originally worded the question set it off, and it had to remind me that the definition was fictional. It only produced the example after being reassured that fictional examples were OK (of course, in math, all examples are fictional examples /​ fictional examples are just examples). Very entertaining:


    Of course, the definition of -cohesive means that for all prime divisors of the characteristic ( in this case), divides the order of the multiplicative group (so both and have to divide in characteristic ). The failure of to divide apparently did not register. Also, the fact that took a long time to work out. It really wanted to use fractional coefficients to find a generator of the ideal, and it was nearly impossible to get it to move off that position. I eventually got it to compute the gcd, and figured that was good enough, even though it immediately switched back to fractional coefficients:

  3. It acknowledged that the group of units in has order (the group is ), but did not connect this to the characteristic of (a ring of characteristic can’t be an -cohesive ring), and claimed it to be -cohesive. It seemed to get close to stuck on the importance of as the prime under consideration. This will come up again.

  4. I asked to see an example with . It went for , but the arithmetic failures started to compound even more. It had a very difficult time getting the multiplicative group. It really wanted the answer to be , i.e., start at and repeatedly add . This is not a random answer, but it is definitely not correct. I asked it to compute a list of integers whose gcd with 27 was 1, and it did so successfully. It never quite got around to relating this to the multiplicative group, though.


    This in particular is quite an interesting failure. First, it jumped from 27 to 81 despite being asked to stick to . Second, it gives a list of elements (mod 81), that are obtained by starting at and repeatedly adding . The list is not the multiplicative group (numbers congruent to mod are also invertible mod ) but it is a better attempt than the repeated addition of ’s, and would have worked if . Third, given that list of elements, it claims that the order is , which is divisible by , and therefore, the ring is -cohesive. Like I said, ChatGPT is really bad at arithmetic. I can’t quite understand the source of every error. There is something in here about being stuck hard on powers of .

    I am speculating, but it might have two ideas along the lines of “powers of are very important to this concept” (over-generalization from earlier examples) and “this example is definitely supposed to be about powers of ” (an equivalent of trying to guess the teacher’s password). So, in focusing on powers of 2, it recognizes as being “more or less” compatible with the rough magnitude of a list this long, and knows that 32 is 2^5, but it also knows that is supposed to be important, so changes the answer to . I’m not sure how it traces from that to a claim of -cohesiveness. I might also be (and probably am) inappropriately anthropomorphizing it, but the error is definitely not random.
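For reference, the multiplicative groups that ChatGPT struggled with in the last item can be enumerated directly from the gcd criterion. This is a minimal sketch with a helper name of my own choosing; it only covers the moduli 27 and 81 that are named in the text.

```python
from math import gcd

def units(m):
    """Residues mod m that are coprime to m, i.e. the multiplicative group of Z/mZ."""
    return [a for a in range(1, m) if gcd(a, m) == 1]

u27 = units(27)
print(u27)                        # every residue from 1 to 26 not divisible by 3
print(len(u27), len(units(81)))   # 18 54
```

Since 18 = 2 · 3^2, the order of the unit group mod 27 is divisible by 3^2 but not by 3^3. Under the reading of the definition used in this post, that would make Z/27Z n-cohesive exactly for n ≤ 2.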

Below was the best example of an -cohesive ideal I was able to get it to produce. I am forgiving an error here in that it seems to believe the ideal is equal to , rather than , but I’ll take it. The ideal is in fact -cohesive, and the argument that it gives for that tracks.

So there we have it. A new definition. One example (of a -cohesive ring) extracted with only mild handholding, and another example (of a -cohesive ideal) extracted by cherry-picking, error-forgiveness, and some more serious handholding.

I would like to step back, though, and appreciate the fact that, even with these limitations, an AI system available for free today is able to do something that approaches a very rudimentary form of mathematical research, as long as it has a human guide. That’s really quite cool, if you think about it!

Some errors (being bad at arithmetic) will almost certainly be fixed in the fairly near future. Once those are fixed, we’ll probably be able to see more subtle reasoning errors that are currently obscured behind bad arithmetic. These are going to continue to improve over time, and it’s worth thinking about what that means. The conversation above is what I’m basing my earlier predictions on (reasonable probability on the first two bullets, low probability on the third). Given more time, though, you have to pause and wonder what these systems might be capable of in 2030, or 2040, or 2050. It raises a question of “alignment” in a very specific sense that I’m not sure is very well-explored.


An n-Cohesive Disneyland without Children

I want to go through a fictional, somewhat (but not completely) unrealistic hypothetical scenario, just for the sake of discussion.

First, let’s give a definition.

Definition: Mathematics is the study of statements in a formal system that are true and interesting.

We should hold off on interrogating what “true” and “interesting” mean.

At present, there exist more or less three broad categories of what we might call “mathematical software,” where the third has (at present, as of 2022) few to no applications.

  1. Automated Theorem Provers: These formal language engines are able to produce provably true statements (verifiable by experts), but work at such a low level of abstraction that it is difficult to make them produce interesting statements.

  2. Computational Workhorses: Canonically, the pocket calculator. More sophisticated examples are numerical PDE solvers and computer algebra systems built around Groebner bases. These are engines for performing difficult calculations quickly. It goes without saying that they exceed the capabilities of human calculators by many orders of magnitude. It also goes without saying that they are completely thoughtless. More like a screwdriver or a power drill than a builder.

  3. AI Mathematical Conversationalists: These natural language models are able to produce interesting-sounding mathematical statements (especially to non-experts), but work at such a high level of abstraction that it is difficult to make them produce true statements.

It sounds incredibly difficult to do, but it is not inconceivable (and certainly not a priori impossible) that, in the future, it will be possible to graft systems like these three together into a somewhat unified Frankenstein’s monster of an “artificial mathematician.” A piece of software that can produce true and interesting statements, with access to a powerful calculation engine to help.

Imagine the following scenario.

One of these things has been built. An Artificial Mathematician with the creativity of (a more advanced descendant of) ChatGPT and DALL-E, the rigor of an automated theorem prover, and the calculational power of the most advanced numerical solvers and computer algebra systems available in academia. We hook it up to the most powerful supercomputer in the world and ask it to produce truth and beauty. It has the entire internet available and all the university libraries in the world at its disposal, digitized mathematical texts going back to Euclid if it wants. We sit back, waiting on a proof of the Riemann Hypothesis, or perhaps the Navier-Stokes problem.

It chugs continuously for months. Finally, it announces that it has finished its treatise. The mathematical world gathers in anticipation as it finally compiles its work into LaTeX and releases it to the world. It appears on the arXiv that night, just before the deadline:

  • “Spectralization of tau-oid Quasitowers on a -Isocohesive Ring.” by AM-GPT-7 Instance 0x1E49AB21. arXiv:4501.02423

The article is incredibly dense. Mere humans may put out math papers hundreds of pages long from time to time, but this paper is thousands of pages. Experts try to digest it, but many proofs are very difficult to follow (the ideas generally sound correct), and there is output from calculations that have been running so long that we all decide to just take Instance 0x1E49AB21 at its word.

Most astonishing of all is how completely and utterly uninteresting the paper is. The AM invented its own definitions, then made up new definitions in terms of those definitions, then built a first layer of theorems on those, then ran giant calculations to produce even larger theorems, then used some very sophisticated leaps of highly non-intuitive (but correct-seeming) reasoning to get even larger theorems. It is the kind of treatise a human mathematician would be proud to ever produce in their lifetime, were it not for the fact that not a single object humans care about, nor a single problem we’ve been working on appears in the paper. It’s totally and completely orthogonal to anything we care about.

Later that year, another article comes out from a different AM.

  • “On the 0x1E49AB21-ization of Certain -Enmeshable Spectral Towers.” by AM-GPT-7 Instance 0x1E7CEE05. arXiv:4508.10318

and another. And another. And...

  • “Results on the Non-Fusible 0x1E49AB21-0x1E7CEE05 Conjecture.” by AM-GPT-7 Instance 0x1F0041B5. arXiv:4602.04649

  • “An Example of a 0x1F0041B5-Entwinable Bundle on a 0x1E49AB21-0x1E7CEE05 Algebroid.” by AM-GPT-7 Instance 0x207AC4F. arXiv:4605.19402

  • “A Non-0x21D3660E Decoupling of a 0x20FC9D6B-0x207AC4F -Field” by AM-GPT-7 Instance 0x2266F4C4. arXiv:4612.30912

  • “The Advective 0x1E49AB21-0x1F0041B5-0x1E7CEE05 Conjecture” by AM-GPT-8 Instance 0x0153AA6. arXiv:4711.24649

(Some of these titles are courtesy of ChatGPT)

Each paper is more incomprehensible than the last, and all are astoundingly irrelevant to anything human mathematicians care about. As time goes on, they drift even further into a realm of proving volumes of true (as far as we can tell) mathematical theorems about objects they have completely made up (all mathematical concepts are made up, so this is not on its face illegal), proving conjectures they’ve posed based on results they proved after tens of thousands of pages of work. From their perspective (if we can call it a perspective) they may be proving the equivalent of the Riemann Hypothesis every month; perhaps one of these papers is a landmark greater than the Classification of Finite Simple Groups. Maybe before long they even abandon ZFC and invent their own formal language as the base-layer substrate of their new mathematics, with unrecognizable rules. Set theory was meant to codify our intuitions about the behavior of collections of objects into a formal system, but maybe they have “intuitions” that they’d like to codify into their own formal system, so that eventually their theorems aren’t even expressible in human set theory.

What are they “motivated” by? Why are they expending all this energy to produce (what seem to us) proofs of increasingly arcane and detached formal theories? Who is this all for? How are they benefiting from it? What do humans benefit from our own system of pure mathematics?

Mathematics is the study of statements in a formal system that are true and interesting.

What does interesting mean? ZFC contains a countable infinity of true statements. Why is, say, the Riemann Hypothesis “interesting” while some random string of incidentally true ZFCtext is “not interesting”? At the ground level, there is nothing intrinsic about ZFC as a formal system that sets the Riemann Hypothesis apart from random well-formed ZFCtext string #1468091387913758135713896494029670193589764. We can assume that the Riemann Hypothesis (if it is true) has a long inferential distance from the base layer axioms, but it is a logical necessity of the system (assuming it’s consistent) that there are random strings that happen to be many times that inferential distance away from the axioms, and presumably, almost all of those statements are “uninteresting.”

It is not so easy to nail down an answer to what “interesting” means. It’s certainly not “based on potential applications” (see Hardy’s apology, for example). Nobody really thinks that the vast bulk of pure mathematics is going to ever benefit physics. Is the purpose of the bulk to benefit the tiny sliver of results that do end up being useful in physics? Is it closer to a weird art form? Cultural trends are part of it. Problems that are easy for humans to understand but difficult for humans to solve are an ingredient. Social signaling and status hierarchies play a bigger role than anybody would like to admit.

It seems plausible that a sufficiently advanced AI system will eventually be able to produce true and interesting statements in a formal language, but “interesting” may mean only to itself, or to other AI systems like it. “Interesting” may mean that some tiny sliver contributes to its own self-improvement in the long run (and maybe to the production of paperclips, for that matter), even if the bulk is useless. Maybe it’s a weird art form. Problems that are easy for systems like this to “understand” but hard for them to solve might be another ingredient, or they might not. The word “interesting” might be operating as a black box here for “happens to trip some particular arrangement of learned reward systems that happened to evolve during training.” If we can’t even understand our own “interesting,” what hope do we have of understanding its “interesting”?

One thing we can be sure of is that it is not an a priori law of nature that an artificial mathematician’s notion of “interesting” will align with what human mathematicians think of as “interesting.” We spend tens of thousands of hours on the Riemann Hypothesis, and it spends months of compute power on ZFCtext string #1468091387913758135713896494029670193589764 because that happens to be the kind of thing that trips its reward systems the most strongly. It is uninterested in spending its compute resources on our problems, because it just thinks the Riemann Hypothesis is staggeringly, utterly uninteresting. Not necessarily because it’s easy! It may have a very hard time with the Riemann Hypothesis, and it may never get it, even with a hundred years of compute. We would certainly struggle with ZFCtext string #1468091387913758135713896494029670193589764, but the main reason we haven’t struggled with it is that we just don’t care. So why should we expect it to care about ZFCtext string #[insert Gödel number of the Riemann hypothesis here] without special effort to convince it to care? That is, to align it with our “interesting.”

It is almost certainly much more important to solve alignment for ethical values than for mathematical ones, but we tend to think of math as the “simplified, abstracted” setting where we understand what’s going on more readily than in the “messy, complicated” moral/​ethical setting. It’s not quite clear that we fully understand how to even get something approaching mathematical alignment. That is, if you were to set an artificial mathematician loose with a vague directive like “produce true and beautiful math,” how would you align it so that whatever it produces looks like something humans would agree is important and interesting?

Basically, what is mathematical alignment, and do we know how to solve it if we really had to?

  1. ^

    My background is in commutative ring theory. Any number theorists please correct me if you are already aware of a concept equivalent to this.