Before June 2022 was the month of the possible start of the Second American Civil War, it was the month of a lively debate between Scott Alexander and Gary Marcus about the scaling of large language models, such as GPT-3. Will GPT-n be able to do all the intellectual work that humans do, in the limit of large n? If so, should we be impressed? Terrified? Should we dismiss these language models as mere “stochastic parrots”?
I was privileged to be part of various email exchanges about those same questions with Steven Pinker, Ernest Davis, Gary Marcus, Douglas Hofstadter, and Scott Alexander. It’s fair to say that, overall, Pinker, Davis, Marcus, and Hofstadter were more impressed by GPT-3’s blunders, while we Scotts were more impressed by its abilities. (On the other hand, Hofstadter, more so than Pinker, Davis, or Marcus, said that he’s terrified about how powerful GPT-like systems will become in the future.)
Anyway, at some point Pinker produced an essay setting out his thoughts, and asked whether “either of the Scotts” wanted to share it on our blogs. Knowing an intellectual scoop when I see one, I answered that I’d be honored to host Steve’s essay—along with my response, along with Steve’s response to that. To my delight, Steve immediately agreed. Enjoy! –SA
I’m a huge fan of Pinker. How The Mind Works and The Language Instinct are two of my all-time favorite books. So I’m surprised and saddened to see him engaging in this debate for years without showing a familiarity with many of the core AI concepts, such as instrumental convergence and corrigibility.
I love his books too. It’s a real shame.
“...such as imagining that an intelligent tool will develop an alpha-male lust for domination.”
It seems like he really hasn’t understood the argument the other side is making here.
It’s possible he simply hasn’t read about instrumental convergence and the orthogonality thesis. What high-quality, widely shared introductory resources do we have on those, after all? There’s Robert Miles, but you could easily miss him.
In the FLI podcast debate, Stuart Russell outlined things like instrumental convergence and corrigibility, though they took a backseat to his own standard/nonstandard model approach. He challenged Pinker to publish in a journal his reasons for not being compelled to panic, but warned him that many people would emerge to tinker with and poke holes in his models.
The main thing I remember from that debate is that Pinker thinks the AI x-risk community is needlessly projecting “will to power” (in the Nietzschean sense) onto software artifacts.
“core AI concepts, such as instrumental convergence and corrigibility.”
Are those concepts core to AI in general or just to the LessWrong+ version of AI?
I get that. But there are lots of AI researchers who know little or nothing of discussions here. What’s the likelihood that they know or care about things like instrumental convergence and corrigibility?
Phrases like “AI safety” and “AI ethics” probably conjure up ideas closer to machine learning models with socially biased behavior, stock trading bot fiascos, and such. The Yudkowskian paradigm only applies to human-level AGI and above, which few researchers are pursuing explicitly.
Can you quote something specific where he makes a mistake?