what’s the ground-truth oracle for ‘came up with a valuable new theorem, rather than arbitrary ugly tautological nonsense of no value’?
Or do they just think that these remaining issues are the sort of thing that AI-powered R&D can solve and so it is enough to just get really, really good at coding/math and they can delegate from there on out?
My personal view is that OA is probably wrong about how far the scaling curves generalize, with the caveat that even eating math and coding entirely, à la AlphaZero, would still be massive for AI progress, though compute constraints will bind eventually.
My own take is that the o1 approach will plateau in domains where verification is expensive. Thankfully, most tasks of interest tend to be easier to verify than to solve, and much of math and coding is close to ideally suited to verification, so I expect it to be far easier to build simulators for these domains that aren't easy to reward hack.
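To make the verify-versus-solve gap concrete, here is a toy sketch of my own (it has nothing to do with how o1 is actually trained): subset-sum, where checking a proposed answer takes a quick pass over the candidate, while the naive search may have to try every subset. The names `verify` and `solve` are hypothetical, chosen only for illustration.

```python
from itertools import combinations


def verify(numbers, target, candidate):
    """Cheap check: candidate draws only from `numbers` and sums to `target`."""
    pool = list(numbers)
    for x in candidate:
        if x not in pool:
            return False  # candidate uses a number that is not available
        pool.remove(x)  # each available number may be used at most once
    return sum(candidate) == target


def solve(numbers, target):
    """Brute-force search: may try up to 2**len(numbers) subsets."""
    for r in range(len(numbers) + 1):
        for combo in combinations(numbers, r):
            if sum(combo) == target:
                return list(combo)
    return None  # no subset sums to target


numbers = [3, 34, 4, 12, 5, 2]
answer = solve(numbers, 9)
print(answer, verify(numbers, 9, answer))
```

A verifier like this can serve as a reward signal that is hard to game, because the only way to pass is to produce a subset that really sums to the target. The harder cases are the ones where no such cheap check exists.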
Eh, those tautologies are both interesting in their own right and valuable training data, since the model learns how to prove statements from them.
I think the unmodelled variable is that they consider software-only singularities more plausible, à la this:
Or this:
https://www.lesswrong.com/posts/FG54euEAesRkSZuJN/ryan_greenblatt-s-shortform#z7sKoyGbgmfL5kLmY