Even if you had a proof generating oracle, the problem is that most of the hard work is much more intangible, stuff like “what even are the right definitions?” and “what theorems should we be trying to prove?”. A proof oracle would sure be helpful, but most of the hard part is much harder to check.
As an example, basically every famous math problem (e.g. the Riemann hypothesis) is to my understanding important mainly because of the expectation that solving it would require significant theoretical developments. If instead you got a proof that you cannot readily mine conceptual insights out of, or if you don’t build a theory in the process, or if the AI builds the theory but no human understands it (say, because they can’t tell what’s junk and what’s legit without significant time and effort), then they are much less helpful. It also scales less with adding more chain of thought time or doing it in parallel a bunch.
Even if you had a proof generating oracle, the problem is that most of the hard work is much more intangible, stuff like “what even are the right definitions?” and “what theorems should we be trying to prove?”. A proof oracle would sure be helpful, but most of the hard part is much harder to check.
As an example, basically every famous math problem (e.g. the Riemann hypothesis) is to my understanding important mainly because of the expectation that solving it would require significant theoretical developments. If instead you got a proof that you cannot readily mine conceptual insights out of, or if you don’t build a theory in the process, or if the AI builds the theory but no human understands it (say, because they can’t tell what’s junk and what’s legit without significant time and effort), then they are much less helpful. It also scales less with adding more chain of thought time or doing it in parallel a bunch.