The key point is that I only need one example, so it doesn’t matter whether or not the LLM simply got it lucky.
However, I do agree that if we assume the LLM simply got lucky, this would be very bad in a practical sense, since insights that arise only by pure chance would take a long time to repeat.
A key part of my worldview here is that I think it’s rarely productive to debate whether or not AI is fundamentally incapable of a certain task, because if we allow arbitrary resources, and especially arbitrary designs subject only to the loosest requirements of existence, we can trivially solve any problem with AI/create ASI. The actual question is whether a particular approach can do the task in question under limits on resources.
Another way to say it is that you should always focus on the quantitative questions over the qualitative questions.
In general, overfocusing on fundamental barriers that remain even with infinite compute, rather than on the finite-compute case, has been one of the biggest obstacles to AI progress/good AI discourse over the years. (Admittedly, for theoretical analysis like computational complexity, part of the issue is that proving things about computation is quite hard, and a lot of the difficulty comes from the fact that we believe cryptography exists, combined with the fact that we are quite bad at opening up the black box of a computer, which is necessary in order to solve a lot of problems in computational complexity.)
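The finite-compute point can be made concrete with a toy sketch (my own illustration, not from the original discussion): a brute-force key search is "possible in principle" for any key size, so the qualitative question is trivially settled, but the quantitative question of resource cost is what actually matters. The `xor_encrypt` toy cipher and the specific key below are made up for the example.

```python
from itertools import product
from typing import Optional

# Toy "encryption": XOR the message with a repeating key.
def xor_encrypt(message: bytes, key: bytes) -> bytes:
    return bytes(m ^ key[i % len(key)] for i, m in enumerate(message))

def brute_force(ciphertext: bytes, known_plaintext: bytes,
                key_bits: int) -> Optional[bytes]:
    """Try every key of the given size. This works for ANY key size in
    principle, but the loop body runs 2**key_bits times, so only tiny
    keys are feasible with finite compute."""
    key_len = key_bits // 8
    for candidate in product(range(256), repeat=key_len):
        key = bytes(candidate)
        if xor_encrypt(ciphertext, key) == known_plaintext:
            return key
    return None

secret_key = b"\x2a\x07"                      # 16-bit key: 65,536 candidates
ct = xor_encrypt(b"hello", secret_key)
recovered = brute_force(ct, b"hello", 16)     # finishes near-instantly
print(recovered == secret_key)                # True

# The identical algorithm "solves" a 128-bit key too, qualitatively --
# but 2**128 iterations is astronomically beyond any finite compute
# budget, which is the entire practical content of the question.
```

The qualitative answer ("can a machine break this?") is the same for both key sizes; only the quantitative answer distinguishes them.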
I’ll quote from a recent comment here:
My take on how recursion theory failed to be relevant for today’s AI is that it turned out that what a machine could do if unconstrained basically didn’t matter at all, and in particular it basically didn’t matter what an ideal machine could do, because once we actually impose constraints that force computation to use very limited amounts of resources, we get a non-trivial theory, and importantly all of the difficulty of explaining how humans do stuff lies there.
There was too much focus on “could a machine do something at all?” and not enough focus on “what could a machine with severe limitations do?”
The reason is that in a sense, it is trivial to solve any problem with a machine if I’m allowed zero constraints except that the machine has to exist in a mathematical sense.
A good example of this is the paper A Universal Hypercomputer, which shows how absurdly powerful computation can be if you are truly unconstrained.
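The "existence is not enough" point above can be sketched in a few lines (my own illustration, under the assumption that the answers were handed to us by some oracle): for any function on a finite domain, a machine computing it trivially exists as a lookup table, and the entire difficulty lies in constructing the table, not in the machine's existence.

```python
# For ANY function on a finite domain, a "machine" computing it exists
# trivially: a lookup table. Mathematical existence tells us nothing
# about how to build it. Suppose an oracle handed us halting answers
# for three programs (these particular answers happen to be correct,
# but the point is we did no work to obtain them):
halting_answers = {
    "while True: pass": False,          # loops forever
    "print('hi')": True,                # halts immediately
    "for i in range(10): pass": True,   # halts after 10 iterations
}

def halts(program: str) -> bool:
    # A perfectly valid "machine" for these inputs. All of the
    # difficulty of the halting problem was moved into constructing
    # the table, which existence-style arguments never address.
    return halting_answers[program]

print(halts("print('hi')"))  # True
```

This is why "a machine deciding X exists" is uninteresting without an account of the resources needed to find or run that machine.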
Another, smaller point: I tend to have a somewhat more continuous model of how much creativity AIs currently have. There’s a point to be made that sufficiently terrible creativity is in practice no different from having 0 creativity, and it might turn out to be too inefficient to scale creativity, because in-context learning scales poorly (context windows are not a sufficient replacement for a long-term memory, and continual thinking is just way more efficient than pure mesa-optimization). Still, I do claim that literally 0 creativity or insight is not very likely based on priors.
Really, this could be fixed by simply switching the question from “Are LLMs creative/do they have insights at all?” to “Is there any way to get an LLM to autonomously generate and recognize genuine scientific insights at least at the same rate as human scientists?”
And I’d easily agree that so far, LLMs have not actually done this, especially if we require it to be as good as the best human scientists.
To be clear, I’m not smuggling in any claim that, because LLMs can produce a non-zero number of insights, this reliably scales with the current architecture into enough insights to replace AI researchers and start an AI takeoff by, say, 2030.