The more mainstream you go, the larger this effect gets. A lot of people seemingly want AI to be a nothingburger.
When LLMs emerged, you’d see people in mainstream circles go “it’s not important, it’s not actually intelligent, you can see it make the kind of reasoning mistakes a 3-year-old would”.
Meanwhile, on LessWrong: “holy shit, this is a big fucking deal, because it’s already making the same kind of reasoning mistakes a human three-year-old would!”
I’d say that LessWrong is far better calibrated.
People who weren’t familiar with programming or AI didn’t have a grasp of how hard natural language processing or commonsense reasoning used to be for machines. Nor did they grasp the implications of scaling laws.
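(For reference, a rough sketch of what those scaling laws say, using Chinchilla-style notation rather than anything stated in this thread: test loss falls smoothly and predictably as a power law in model size $N$ and training data $D$,

$$L(N, D) \approx E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}},$$

which is why people who took them seriously could forecast, well in advance, roughly how much better each additional order of magnitude of compute would make these models.)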
FWIW, that was me in 2022, looking at GPT-3.5 and being unable to imagine how capabilities could progress from there without immediately hitting ASI. (I don’t think I ever cared about benchmarks. Brilliant humans can’t necessarily ace math exams, so why would I gatekeep the AGI term behind that?)
Now it’s two-and-a-half years later and I no longer see it. As far as I’m concerned, this paradigm harnessed most of its general-reasoning potential at 3.5 and is now asymptoting out around something. I don’t know what this something is, but it doesn’t seem to be “AGI”.
All “improvement” since then has just been window dressing: the models learning to convincingly babble about ever-more-sophisticated abstractions and to solve ever-more-complicated math/coding puzzles that make their capabilities legible to ever-broader categories of people. But it’s not anything GPT-3.5 wasn’t already fundamentally capable of; and GPT-3.5 was not capable of taking off, and there have been no new fundamental capability advances since then.
(I remember dreading GPT-4, and then it came out, and sure enough people were freaking out, and then I looked at what they were freaking out over, and it was… its ability to solve marginally harder physics puzzles? Oh. Oh no, that’s… scary?)
Now, granted, it’s possible that you can take these LLM things, and use their ability to babble their way through short-horizon math/coding puzzles to jury-rig something that’s capable of taking off. I don’t mean to say that LLMs are useless or unimpressive; that scenario is where my other 20% are at.
But it seems increasingly less likely to me with each passing day and each new underwhelming advancement.
What observations would change your mind?
See here.
Your observations are basically “At the point where LLMs are AGI, I will change my mind.”
If it solves Pokémon one-shot, solves coding, or makes human beings superfluous for decision-making, it’s already practically AGI.
These are bad examples! All you have shown me is that you can’t think of any serious intermediate steps LLMs have to go through before they reach AGI.
No, it’s possible for LLMs to solve a subset of those problems without being AGI (even plausible, as the history of AI research shows we often assume tasks are AI-complete when they are not, e.g. Hofstadter with chess and Turing with the Turing test).
I agree that the tests which are still standing are pretty close to AGI; this is not a problem with Thane’s list, though. He is correctly avoiding the failure mode I just pointed out.
Unfortunately, this does mean that we may not be able to predict AGI is imminent until the last moment. That is a consequence of the black-box nature of LLMs and our general confusion about intelligence.
Why on earth would Pokémon be AGI-complete?