What is your opinion on the recent developments in LLMs? I feel the 9 months since your comment have shown they are not slowing down.
My argument at the time was that Chinchilla scaling might be slowing down, but that there might be cheaper ways to improve LLMs. Unfortunately, I can’t evaluate the accuracy of my prediction, because I don’t know (for example) how much model size changed between the Claude 3.7 and Claude 4.5 models. Did their parameter count go up? Did they increase their pre-training compute by an order of magnitude? Or did they just continue to lean into reasoning, more RL, and better training data?
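(For context, by “Chinchilla scaling” I mean the compute-optimal recipe from Hoffmann et al. (2022), which fit pre-training loss roughly as

$$L(N, D) \approx E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}, \qquad \alpha \approx 0.34,\ \beta \approx 0.28,$$

where N is parameter count and D is training tokens. Under that fit, for a fixed compute budget both N and D should grow roughly as the square root of compute, working out to on the order of 20 training tokens per parameter. The exponents are the paper’s fitted values, so treat them as approximate.)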
But yes, I agree that Claude Code has improved dramatically in the last 12 months, and I doubt that it has stopped.
Now we’re in a weird place:
I see Claude Opus 4.5 and 4.6 as a pretty convincing proof of concept for AGI. These models still suffer from significant limitations, but it’s clear to me that they do think, and that they already exceed median human intelligence on certain kinds of tasks.
But at the same time, I don’t think that current techniques will allow removing several of those significant limitations.[1]
I don’t want to get into specifics here, because I personally fear that rigorous alignment is impossible. And so every year that nobody makes those breakthroughs is (frankly) one more year that the people I love get to live in a world where humans actually control our fate.