The paradigm and agenda here are interesting, but haven’t been explained much in this post. The sponsors of the prize seem to want to derive probabilistic number theory from an information theory framework, as a step towards understanding what AI will be able to deduce about mathematics.

Reference 1 says a few interesting things. The "Erdős–Kac theorem", about the number of prime factors of a typical large number, is described as "impossible to guess from empirical observations", since the pattern only starts to show up around one googol. (The only other source I could find for this claim was a remark on Wikipedia made by an anon from Harvard.)

This is in turn said to be a challenge to the current scientific paradigm, because it is "provably beyond the scope of scientific induction (and hence machine learning)". Well, maybe the role of induction has been overemphasized for some areas of science; I don't think general relativity was discovered by induction. But I agree it's worthwhile to understand what heuristics besides induction play a role in successful hypothesis generation.

In the case of Erdős–Kac, I think the proposition is a refinement of simpler theorems which do have empirical motivation from much smaller numbers (e.g. Hardy and Ramanujan's earlier result that ω(n), the number of distinct prime factors, is typically close to log log n). So perhaps it arises as a natural generalization (natural if you're a number theorist) to investigate.
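To make the "googol" point concrete, here's a minimal Python sketch (my own illustration, not from Reference 1) that sieves ω(n), the number of distinct prime factors, up to 10^5. Erdős–Kac says the standardized values (ω(n) − log log n)/√(log log n) tend to a standard normal, but the normalizing quantity log log n grows absurdly slowly: it's only about 2.4 at n = 10^5, and still only about 5.4 at a googol, which is why the Gaussian shape is invisible at any empirically accessible scale.

```python
import math

# Sieve the count of distinct prime factors omega(n) for all n <= N.
N = 100_000
omega = [0] * (N + 1)
for p in range(2, N + 1):
    if omega[p] == 0:               # no smaller prime divides p, so p is prime
        for m in range(p, N + 1, p):
            omega[m] += 1

values = omega[2:]                  # omega(n) for n = 2 .. N
mean = sum(values) / len(values)
loglog = math.log(math.log(N))      # ~2.44 at N = 10^5

# Erdős–Kac predicts mean and variance both ~ log log n, with a normal
# limiting shape -- but at this scale log log N is tiny, so the empirical
# distribution is still visibly skewed rather than Gaussian.
print(f"mean omega(n) for n <= {N}: {mean:.3f}")
print(f"log log N: {loglog:.3f}")
```

The empirical mean lands near log log N (plus a constant of order 1), but the distribution around it is nowhere near normal yet, which matches the claim that the theorem could not have been guessed from data.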

Reference 1 also claims to prove (6.3.1) that "no prime formula may be approximated using Machine Learning". I think this claim needs more work, because a typical ML system has an upper bound on what it can output anyway (it has a fixed number of bits to work with), whereas primes are arbitrarily large. So one would need to say something about how the limits to approximation scale with respect to that upper bound.

edit: I meant to link to some related posts: Logical Share Splitting, Logical Probability of Goldbach’s Conjecture.