Why I take short timelines seriously

I originally started writing this as a message to a friend, to offer my personal takes on timelines. It ended up getting long enough that I decided to turn it into a post.

These are my personal impressions, gathered while doing a bachelor's and a master's degree in artificial intelligence, and while working for about a year and a half in the alignment space.

AI (and AI alignment) has been the center of my attention for a little over 8 years now. For most of that time, if you asked me about timelines, I’d gesture at an FHI survey that suggested a median timeline of 2045-2050 and say “good chance it happens in my lifetime.” When I thought about my future in AI safety, I imagined that I’d do a PhD, become a serious academic, and by the time we were getting close to general intelligence I would already have a long track record of working in AI (and be well placed to help).

I also imagined that building AI would involve developing a real “science of intelligence,” and I saw the work that people at my university (University of Groningen) were doing as pursuing this great project. People there were working on a wide range of machine learning methods (of which neural networks were just one idea), logic, knowledge systems, theory of mind, psychology, robotics, linguistics, social choice, argumentation theory, etc. I often heard that “neural networks are not magic,” and I was encouraged to embrace an interdisciplinary approach to understanding how intelligence worked (which I did).

At the time, there was one big event that caused a lot of controversy: the success of AlphaGo (2016). To a lot of people, including myself, this seemed like “artificial intuition.” People had not been very impressed by Deep Blue's success in chess, because that was “just brute force,” which would obviously not scale; real intelligence was about doing more than brute force. AlphaGo was clearly very different, though everyone disagreed on what the implications were. Many of my professors bet hard against deep learning continuing to succeed, but over and over again they were proven wrong. In particular, I remember OpenAI Five (2017/2018) being an extremely big deal in my circles, and people starting to look at OpenAI as something that could change everything.

There was another idea I embraced, something adjacent to Moravec’s paradox: AI would be good at the things humans are bad at, and vice versa. It would first learn to do a range of specialized tasks (each individually very impressive), gradually move toward more human-like systems, and the very last thing it would learn to do would be to master human language. This particular idea about language has been around since the Turing test: mastering language would require general, human-level intelligence. If you had told me there would be powerful language models in less than a decade, I would have been quite skeptical.

When GPT happened, it dramatically changed my future plans. GPT-2, and especially GPT-3, were extremely unnerving to me (though mostly exciting to all my peers). This was, in my view:

  1. “mastering language,” which was not supposed to happen until we were very close to human-level intelligence

  2. demonstrating general abilities. I can’t overstate how big of a deal this was. GPT-2 could correctly use newly invented words, do some basic math, and handle a wide range of unusual tasks that we now call in-context learning. There was nothing even remotely close to this anywhere else in AI, and people around me struggled to understand how it was even possible.

  3. a result of scaling. When GPT-3 came out, this was especially scary, because OpenAI hadn’t really done anything to improve on the design of GPT-2; they had just made it bigger. Instead of hitting the strongly diminishing returns that had always been the case for other AI algorithms, GPT-3 was a massive improvement achieved entirely by scaling up. This was a slap in the face to the project of building a “science of intelligence,” and strong evidence that building AGI would be a lot easier than anyone around me had originally imagined.

I will say that, although this was enough evidence for me to give up my plans of getting a PhD, led me to participate in the 2022 cohort of AI Safety Camp, and eventually drew me more directly into the alignment community, I still held out a lot of hope that this was all a dramatic over-correction. GPT was pretty weird, and I felt a fair amount of sympathy for a kind of “stochastic parrot” story. My professors were quick to bring up ELIZA, and sure, maybe imitating human language wasn’t as big of a deal as I had originally thought. People once thought that mastering chess would require general intelligence too.

During AI Safety Camp I was mentored by janus, and that was my first interaction with people who had real experience working with language models. There I realized that GPT’s weirdness made it especially hard to properly evaluate its actual capabilities, and in particular that it would lead people to systematically underestimate how capable it was. This post by Nostalgebraist does a good job of articulating the point. The basic idea is that GPT wasn’t trained to do the things you are evaluating it on, so any proficiency on your intelligence metrics is, in a sense, a coincidence: the metrics are only (weakly) correlated with accurately predicting text typical of the internet, which is what GPT was actually trained to do.

I went on to work for janus at Conjecture, and every day they would show me GPT traces that blew my mind and forced me to reevaluate what was possible. I didn’t understand what was really going on with GPT; I still find its behavior extremely confusing and hard to draw strong conclusions from. What felt unambiguously clear, though, from everything they showed me (and what I had basically suspected since the GPT-2 paper), was that GPT was somehow doing general cognition. This is the part I find hardest to justify to people, and I think there are others who make the case better than I do. The core heuristic that compels me, however, is remembering just how absolutely inconceivable all of this was even 5 years ago.

I don’t know what my timelines are, and I’m annoyed at people (including janus, sometimes) who put really specific numbers on their predictions. I remember showing the FHI survey to my thesis supervisor during my bachelor's, and him responding with typical Dutch directness: “These numbers came from their ass.” That said, I’m even more baffled by anyone who is very confident that we won’t see some kind of takeoff this decade. Everything is moving so fast, and I don’t see any compelling evidence that things will significantly slow down.