My AI Predictions for 2027
(Crossposted from my Substack: https://taylorgordonlunt.substack.com/p/my-ai-predictions-for-2027)
I think a lot of blogging is reactive. You read other people’s blogs and you’re like, no, that’s totally wrong. A part of what we want to do with this scenario is say something concrete and detailed enough that people will say no, that’s totally wrong, and write their own thing.
--- Scott Alexander
I recently read the AI 2027 predictions[1]. I think they’re way off. I was visualizing myself at Christmastime 2027, sipping eggnog and gloating about how right I was, but then I realized it doesn’t count if I don’t register my prediction publicly, so here it is.
This blog post is more about registering my predictions than trying to convince anyone, but I’ve also included my justifications below, as well as what I think went wrong with the AI 2027 predictions (assuming anything did go wrong).
My predictions for AI by the end of 2027
Nothing particularly scary happens (beyond the kind of hype-driven scariness already present in 2025).
AI is still not meaningfully self-improving.
People still use the term “superintelligence” to describe something that will happen in the future, not something that is already happening.
AI research is not fully automated by AI, and AI certainly won’t be so advanced at AI research that humans can’t even follow along.
AI will not have meaningful control over the day-to-day operation of companies, AI companies or otherwise.
AI does not start out-performing a majority of AI researchers or coders.
AI will not substantially speed up software development projects. For example, the AI 2027 prediction that 2025-quality games will be made in a single month by July 2027 is false.
I still believe my use of AI is less than a 25% improvement to my own productivity as a programmer, whether using full agentic AI or just chatbots. I still believe people who think AI is better than this are basically just mistaken.
AI has not taken a large number of jobs except in a few specific fields. (I am open to more hype-driven job difficulties faced by programmers, but not actual-capabilities-driven job loss for programmers.)
Large language models are still very stupid and make basic mistakes a 5-year-old would never make, as is true in 2025. Yet they are increasingly praised in the media for doing well on the SAT, math olympiad, etc., as in 2025.
LLMs are broadly acknowledged to be plateauing, and there is a broader discussion about what kind of AI will have to replace them.
LLMs still use text-based chain-of-thought, not neuralese.
Most breakthroughs in AI are not a result of directly increasing the general intelligence/”IQ” of the model, but come from e.g. advances in memory, reasoning, or agency. AI can stay on task much longer than before without supervision, especially for well-specified, simple tasks, and especially since AI coding platforms will have gotten better at tool use and at letting AI manually test the thing it’s working on. By the end of 2027, AI can beat a wide variety of video games it hasn’t played before.
AI still can’t tell novel funny jokes, write clever prose, generate great business ideas, invent new in-demand products, or generate important scientific breakthroughs, except by accident.
There is more public discussion on e.g. Hacker News about AI code rot and the downsides of using AI. People have been burned by relying too much on AI. But I think non-coders running businesses will still be hyped about AI in 2027.
Useful humanoid robots do not emerge (other than as demos and so on).
AI still can’t drive a damned car well enough that if I bought a car I wouldn’t have to.
AI are not writing good books or well-respected articles, but they have gotten better at mediocrity, and slop articles and comments are becoming a real problem. It’s really hard to figure out what percentage of e.g. Reddit comments are currently AI generated, so I can’t put a number on this other than to say it becomes noticeably more of a problem.
AI girlfriends/boyfriends/companions do explode in popularity (the market at least doubles compared to now). It becomes common for particularly lonely young people to spend time interacting with them. This is driven as much by the loneliness epidemic as by AI progress, and the meaningful improvements that lead to this are in AI voice naturalness and memory, not intelligence.
Similarly, LLM-driven NPCs in some AAA games start to show up, but are still not common.
None of the sci-fi Hollywood stuff in the AI 2027 predictions come true. AI safety teams do not bring back an older model for safety reasons. There is no government oversight committee that discusses pausing/slowing AI that actually has the power to do that. The president seriously discussing nationalizing AI, CCP spies stealing model weights, plans for attacks on Chinese data centers, humans on the verge of losing control of superhuman AI—none of this happens.
We will still have no idea how to safely align AI.
Justification
Why am I so pessimistic?
The primary reason is that I think LLMs will plateau.[2] I think there are two “complexity classes” of human thinking, and LLMs are only good at one of them.[3] Let’s call them deep and shallow thinking.
Shallow vs. Deep Thinking
Shallow thinking is the kind of automatic, easy thinking you do when you have a regular conversation, and you’re just blurting out whatever comes to mind[4]. I’m also including effortfully manipulating information within consciousness as shallow thinking. For example, doing arithmetic, however complex, is shallow thinking. Shallow thinking is any thinking that requires simple recall and basic linear/logical information processing.
Deep thinking is what happens when your unconscious mind makes new, useful connections.[5] Humans cannot do it consciously. Doing it unconsciously feels like listening to a “muse”. Clever jokes come from deep thinking, as do brilliant ideas. Nobody has identified a step-by-step process to generate funny jokes, because such a process would probably be exponential in nature and take forever. Instead, the unconscious mind does this in some kind of parallel way.
LLMs are terrible at deep thinking. All of the recent gains in AI capabilities have come from making them better and better at shallow thinking.
Noticing Where LLMs Fail
A seasoned mechanic might know exactly what’s wrong with your engine based on the clicking noise it makes. If you ask him how he knows, he’ll shrug. But he’s right. Probably. My own intuition about LLMs comes from working with them a lot on real-world coding problems. Not toy problems. Not benchmarks. Not English essays. You can BS your English teacher, but you can’t BS reality. I get to see the situations in which LLMs trip and slam their faces against reality. One minute, they seem smart—even brilliant—then the next, they’re like a shrimp that knows Python. From working with them, I’ve gotten a sense for what kinds of problems LLMs fail at. (Yep, my predictions are based on subjective intuition. I can’t prove anything. I’m just guessing.)
LLMs obviously do well at problems which closely match the training data. It’s definitely easier to solve a problem if you’ve already memorized the answer! It’s not always easy to tell if a problem is novel or not, but for novel problems, it seems to me they struggle most with deep thinking problems. As they have scaled up, they’ve gotten much better at shallow thinking (including arithmetic), but not much better at deep thinking.
Benchmarks sound like a way to see how well LLMs do when up against reality, but I don’t think they really are. Solving SAT problems or Math Olympiad problems only involves deep thinking if you haven’t seen millions of math problems of a broadly similar nature. Given that the SATs and the Math Olympiad are designed to be solvable by at least some high school students in a few hours, they probably don’t include problems that would require an LLM to do any deep thinking. Recall + shallow thinking would be enough. This is why LLMs have improved more on benchmarks than in reality.
My position is essentially the “LLMs are glorified autocomplete” hypothesis, except asserting that 90% of human cognition is also glorified autocomplete, and you can get pretty far with that kind of thinking actually. But it’s not the kind of thinking that leads to clever jokes, good business ideas, scientific breakthroughs, etc.
Why the Architecture of LLMs Makes Them Bad at Deep Thinking: They’re Too Wide
GPT-3 is 96 layers deep (where each layer is only a few “operations”), but 49,152 “neurons” wide at the widest. This is an insanely wide, very shallow network. This is for good reasons: wide networks are easier to run efficiently on GPUs, and apparently deep networks are hard to train.
An obvious problem with this architecture is you can’t really solve problems that require more than 96 steps (per token). Completing the next token of “23 + 15 =” might be solvable, but completing the next token of “The stopping time for the number 27 in the Collatz conjecture is: ” will not be (unless the answer is memorized).
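To make the depth limit concrete, here’s a quick sketch (my own illustration, in Python): just counting the Collatz steps for 27 takes more sequential operations than GPT-3 has layers.

```python
# My own illustration: the Collatz chain for 27 requires 111 sequential steps,
# more than GPT-3's 96 layers, so a fixed-depth forward pass can't "walk" it.
def collatz_stopping_time(n: int) -> int:
    steps = 0
    while n != 1:
        n = 3 * n + 1 if n % 2 else n // 2
        steps += 1
    return steps

print(collatz_stopping_time(27))  # 111
```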
A less obvious problem with this architecture is that the model is so big that instead of forming a coherent, hierarchical world model, it can just cram a bunch of training information into the model, outputting right answers by regurgitating memorized patterns. They do form generalizations and abstractions, but not to the extent humans do. We humans can’t just memorize the entire internet, or all of our sense data. Instead, we have to form a deep understanding of the world by boiling reality down to its most important pieces, and throwing the rest away. A very wide LLM doesn’t have to do that as much.
The rules of logic are implemented once in the human brain. LLMs, by contrast, might have hundreds of half-representations of the rules of logic throughout their gargantuan minds. An LLM might have the implicit notion that correlation is not equal to causation inside whatever corner of its mind handles psychology research, then a different representation of the same concept in the part of its mind responsible for A/B testing software. It doesn’t have to learn the fully general concept of correlation vs. causation. This lack of centralization makes forming unique connections between disparate concepts tough, because the links between them that should be there aren’t. With LLMs, we were hoping to get one super-genius, but we ended up with 10,000 morons stapled together.
LLMs Are Also Too Linear
A further problem with LLMs is the lack of recurrence. It’s a linear network. Information goes through one layer, then the next, then the next. This makes exploring deep combinatorial spaces hard. When humans do deep thinking, our unconscious minds have to explore many paths through idea space, perhaps many in parallel. We have to be able to realize a path isn’t bearing fruit and backtrack, trying different paths.
LLMs, on the other hand, are feed-forward networks. Once an LLM decides on a path, it’s committed. It can’t go back to the previous layer. We run the entire model once to generate a token. Then, when it outputs a token, that token is locked in, and the whole model runs again to generate the subsequent token, with its intermediate states (“working memory”) completely wiped. This is not a good architecture for deep thinking.
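Here is roughly the loop I’m describing, as a toy sketch (my own code; `model` stands in for a full forward pass that returns next-token scores):

```python
# Toy sketch of autoregressive generation: each new token triggers a fresh
# forward pass, and only the emitted tokens survive between passes; the
# internal activations ("working memory") do not carry over.
def generate(model, prompt_tokens, n_new_tokens):
    tokens = list(prompt_tokens)
    for _ in range(n_new_tokens):
        scores = model(tokens)                # rebuilds all hidden states from scratch
        next_token = max(range(len(scores)), key=scores.__getitem__)  # greedy pick
        tokens.append(next_token)             # only this token is kept for the next pass
    return tokens
```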
Chain of thought is a hack that helps to add some artificial reflexivity to an otherwise non-recurrent model. Instead of predicting the next token of the answer, models predict the next token of a sequence of reasoning steps to get the answer. This makes it easier for the model to say, “actually no, that’s not right, let me try a different approach,” if its current committed path is a bad one. (I have a pet theory that this is what leads LLMs to overuse em dashes. They’re a good way for one run of an LLM that disagrees with the previous run to pivot away from what the previous run was trying to say—or not!)
I’m sure we’ve all seen LLMs repeatedly change approaches as they flounder on a difficult problem. Chain of thought is better than nothing, but it’s a far cry from how humans reason.
Imagine you had to solve a deep problem, but you were forced to pause your thinking every ten seconds. After every ten seconds of thinking, you had to write down one word and then have your memory of the problem wiped, conscious and unconscious. At the beginning of the next ten seconds, all you’d have access to is the words you’d written so far. Every deep exploration into combinatorial space would be cut short and have to begin anew. All your implicit memory about which paths seemed promising and which went nowhere would be wiped, as would any mental abstractions you’d formed along the way. If your mind operated like this, solving deep problems would be a nightmare. Encoding all your unconscious, parallel thinking into a single word is a hopeless endeavor. But this is how LLMs work!
Some people have suggested allowing LLMs to store a larger amount of non-linguistic information between runs (“neuralese recurrence”). The authors of AI 2027 predict this will happen in March 2027, by the way. This would help, but unless your “neuralese” memory is approximately the size of the entire LLM’s intermediate state, information is still being lost each run. Better to simply allow the model to have a persistent state across runs, though I’m not sure it’s right to call it an LLM anymore at that point. Improving the ability of LLMs to do deep thinking is not a simple matter of scaling up or tweaking the architecture, or more out-of-model hacks.
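For contrast with the generation loop sketched earlier, a persistent-state setup would thread some non-linguistic state through the loop as well. A toy framing (mine, not anyone’s actual architecture):

```python
# Toy contrast: in addition to the tokens, a persistent state vector is read and
# updated on every step, so "scratch work" can survive between token steps.
def generate_with_state(model, prompt_tokens, n_new_tokens, state):
    tokens = list(prompt_tokens)
    for _ in range(n_new_tokens):
        scores, state = model(tokens, state)  # state persists across steps
        tokens.append(max(range(len(scores)), key=scores.__getitem__))
    return tokens, state
```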
If I’m right about the architecture of LLMs being ill-suited to deep thinking, then it won’t be 2 years before we have superintelligent AI. It won’t happen at all until we switch architectures. And that won’t be simple. Recurrent models are hard to train and more expensive to run because they’re hard to parallelize. (And from what I understand, recurrent models themselves currently have similar information-bottleneck issues that would need to be addressed, perhaps along with new hardware that doesn’t separate compute from memory.)
Maybe I’m wrong. I hope not! Either way, it doesn’t change what we should do about AI safety at all. Finding out that the bad guy is going to shoot you in the head in 15 minutes instead of 5 doesn’t change your behaviour much. AI is eventually going to be a major problem, even if it won’t be in 2027.
What’s wrong with AI 2027
In this blog post I mostly wanted to share my own predictions and my reasons behind them, not talk about the AI 2027 predictions. But I wanted to include a section discussing some of the problems I have with them, because I think the problems with AI 2027 speak to common problems people have when reasoning about and predicting the future—problems that are actually related to the two kinds of thinking I outlined above.
The AI 2027 predictions[6] are based on a few different key forecasts. I think their forecasting of compute increases for AI companies are basically reasonable. Where I think the forecasting goes wrong is in the Timelines Forecast (how long until we get superhuman coders?) and the Takeoff Forecast (how long from superhuman coders until subsequent milestones like artificial superintelligence?).
I will confess I didn’t read the entire AI 2027 document, since it’s 193 pages of data/theory/evidence, and I don’t have time for that. More than that, I am immediately skeptical of a document that requires 193 pages to support its main conclusions. Frankly, 193 pages is something to apologize for, not brag about. It is very easy to bury sloppy methodology inside a giant document, and I think that basically happened here. If you can’t explain your argument in a few paragraphs or maybe a few pages, you’re probably relying on a giant chain of reasoning, each link of which introduces further uncertainty until the whole thing is worthless. For brevity I’ll only go into detail about the Takeoff Forecast.
The Takeoff Forecast is Based on Guesswork
It was all very fancy and involved some Serious Mathematics and Statistics, so it took me a while to even parse the arguments. But it seems to boil down to this: the authors each guesstimate how likely they think something is, then do some math to combine their guesstimates. They do this repeatedly for each step of progress, forming a chain of predictions.
Let’s take one of the links in the chain: going from superhuman coders (SC) to superhuman AI researchers (SAR). They operationalize these terms in the document in a specific and robust way.[7]
They estimate it will take 0.3 years to go from SC to SAR. First, they estimate it would take a human coder 3-4 years to do so, then they predict that using SC would speed up AI research and development by 5x. Thereby, they arrive at an estimate of 0.3 years.
Similarly, they say going from SAR to the next step (superintelligent AI researcher, SIAR) would take human coders 19 years, but with the 25x speedup SARs bring, it will only take a few months.
This reasoning is fine. The critical part is the underlying assumptions: Why would it take a human 19 years to get from SAR to SIAR, as opposed to, say, 1 month or 1000 years? And why would SAR give a 25x speedup vs. a 2.5x speedup or a 2500x speedup? These numbers are critical to the whole forecast, so let’s find out where they came from.
In the case of SC → SAR, they break it down into four possibilities:
“The first SC is already an SAR or very close to one.” 15%. In this case, they guess SC → SAR takes 0 years.
“Training the missing SAR skills isn’t more compute intensive than the SC.” 25%. In this case, they guess SC → SAR takes 2 years.
“Training the missing SAR skills would be more compute-intensive than the SC absent further algorithmic progress.” 30%. In this case, they guess SC → SAR takes 5 years.
“Crossing the gap from SC to SAR is a scientific rather than an engineering problem.” 30%. In this case, they guess SC → SAR takes 5 years.
They do math on these percentages and timelines to get an overall estimate of 15% 0 years, otherwise 4 years. As long as the percentages and years are correct in the above list of possibilities, then this overall estimate will be correct too. The crucial parts are those percentage and year estimates. Where do these come from?
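(For what it’s worth, here’s my rough reconstruction of that combination, assuming the conditional estimates are simply probability-weighted; their actual aggregation may be fancier.)

```python
# My rough reconstruction (not necessarily their exact method): conditional on
# the SC not already being an SAR (the remaining 85%), the probability-weighted
# estimate of the other cases comes out to roughly 4 years of human-only work.
cases = [(0.25, 2), (0.30, 5), (0.30, 5)]   # (probability, human-only years)
expected_years = sum(p * years for p, years in cases) / 0.85
print(round(expected_years, 1))             # ~4.1
```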
As an example, for possibility #3, this is their reasoning:
We’d guess that this is the sort of limitation that would take years to overcome — but not decades; just look at the past decade of progress e.g. from AlphaGo to EfficientZero. Remember, we are assuming SC is reached in Mar 2027. We think that most possible barriers that would block SAR from being feasible in 2027 would also block SC from being feasible in 2027.
Umm, what happened to all the precise mathematics we had so far? This is just vibes-based guesswork of the same kind I did when I made my own predictions! I thought we were “estimating” and “forecasting”, not just going with gut feelings!
They say there is a 30% chance that going from SC to SAR “would be more compute-intensive than the SC”. Why? Why not 0.001% or 99.999%?
I suspect it’s because when you know practically nothing about a subject that would let you form a high-quality prediction, it would feel weird to make such a specific, confident guess. So instead you pick percentages roughly equal to one over the number of options you’re choosing from, in the same way someone who doesn’t know how lotteries and probability work might think there’s a 50% chance they’ll win the lottery. The fact that the percentages here are so close to 25% (four options = 25% each) should reveal just how low-confidence and arbitrary they are.
There are different kinds of 50% guesses. There are high-confidence 50% guesses (flipping a coin), and low-confidence 50% guesses (is God real?). I suspect the guesses they are giving here are in the latter category.
In the case where “crossing the gap from SC to SAR is a scientific rather than an engineering problem”, why should it take 5 years, and not one month or 1000 years? They justify it only with guesswork.
It took us a while to dig into this one link in the chain, but it turns out that under all that fancy math, it’s just intuition-based guesswork.
I Don’t Take These Predictions Seriously
The authors do admit to “wide error margins”, but their distributions are still centered around ~2027, which I think is an absurd estimate. Saying they’re not sure, they’re highly uncertain, giving a probability distribution instead of just a median estimate; none of that changes the fact that they’re predicting doomsday in ~2027.
I am sure that combining the estimates of several forecasters is more likely to be correct than simply taking one of their guesses randomly (duh). But averaging guesses is still a terribly inaccurate methodology, quite possibly off by many orders of magnitude, especially since the guesses are likely correlated and possibly wrong in the same direction. Basically, this methodology does not convert the “guesswork” into “forecasting”. It’s still guesswork.
I felt similarly about all the other links in the chain when I looked into them. The entire chain of reasoning in AI 2027 is a shining tower of statistics and logic built on a foundation of sand. Maybe SC → SAR takes zero minutes. Maybe it takes centuries or millennia. I do not believe the authors know any better than I do, and I don’t think all of the fancy analysis and methodology adds anything when it’s built on an extremely low-quality, low-confidence guess.
I especially don’t like that it isn’t clear to people reading the AI 2027 predictions that they’re just guesses, not the detailed forecasts they appear to be. (Except perhaps to the bold few like me who decide to crack open the 193 pages of justification and see what it’s based on!).
The Presentation was Misleading
Nothing wrong with guesswork, of course, if it’s all you’ve got! But I would have felt a lot better if the front page of the document had said “AI 2027: Our best guess about what AI progress might look like, formulated by using math to combine our arbitrary intuitions about what might happen.”
But instead it claims to be based on “trend extrapolations, wargames, expert feedback, experience at OpenAI, and previous forecasting successes”, and links to 193 pages of data/theory/evidence. That makes you think when you open up the 193 pages of research, you’ll see specific predictions based on data like, “AI will likely proceed from SC to SAR in 0.3 years because SAR’s architecture requires us to build a neuralese memory adapter for agents, and we believe this will take 0.3 years because blah blah blah...” But there’s nothing like that. It’s all just based on vibes.
They never outright stated it wasn’t based on vibes, of course, and if you dig into the document, that’s what you find out. They openly admit they “are highly uncertain”.
Yet, imagine I told you I was “highly uncertain” about when I was going to die, but estimated I was going to die on Thursday, January 5th, 2073 at three seconds past 4:18 pm. And then I showed you a lognormal graph which peaks sharply in 2073. And then imagine the news reported on my statement as “rigorously researched”, and many very intelligent people held the statement in high regard. You might get the impression I was a lot more certain than I am.
The whole AI 2027 document just seems so fancy and robust. That’s what I don’t like. It gives a much more robust appearance than this blog post, does it not? But is it any better? I claim no. We shall see.
Deep Thinking vs. Shallow Thinking For Making Predictions
I assert if they’d put more effort into the intuition and less into the post-processing and justification of that intuition, they’d have arrived at more accurate estimates. AI 2027 cites Daniel Kokotajlo’s previous 2021 estimates for AI up to 2026. His “lower-effort” predictions from 2021 were quite accurate. Why?
I think it’s because he was doing deep, intuitive thinking when he formed those predictions. Ironically, the higher-effort AI 2027 prediction looks a lot less reasonable to me because it leans too heavily on shallow thinking (logic, reasoning, and extrapolation).
Listening to several of the authors discuss the AI 2027 predictions after they were published leads me to believe they don’t intuitively believe their own estimates. I won’t try to read their minds, but I think it’s worth mentioning: Even if you can’t spot any flaws in an argument, you shouldn’t believe it’s convincing if it hasn’t actually convinced you.[8]
Shallow thinking can be very useful, but it’s a recipe for disaster when broadly predicting the future. Shallow thinking can only take into account a small amount of information at a time, and the present and future are very complex. Once you’re doing your reasoning in consciousness, you’re limited on the amount of information you can process, and it’s easy to form a fragile chain of logic, any piece of which could make the whole chain useless.
The alternative is deep thinking, which will either be amazing or terrible depending simply on how deep and accurate of an intuition you have for the subject. Logic/shallow thinking is then best used as a fact-checking and communication step.
Was AI 2027 a Valuable Exercise?
Many others have said that while they don’t really buy the actual predictions made by AI 2027, they think writing a concrete AI timeline scenario was a valuable exercise, and that it’s good to “stir up fear about AI so that people will get off of their couches and act”[9].
I am undecided.
Having people believe there is a credible doomsday scenario is possibly good for AI alignment, true. More eyeballs on the AI alignment problem is a good thing, perhaps even if you have to outright lie to get it. If all you have to do is handwave, then even better.
But I do worry about what happens in 2028, when everyone realizes none of the doomsday stuff predicted in 2025 actually came true, or even came close. Then the AI alignment project as a whole may risk being taken as seriously as the 2012 apocalypse theory was in 2013. The last thing you want is to be seen as crackpots.
Conclusion
Predicting the future is hard. I may be totally wrong. If it turns out I’m right and AI 2027 is wrong, I think the takeaway will be that it’s a good idea to be skeptical of fancy, academic reasoning.
Science doesn’t work because of probability distributions and eigenvalues; it works because you’re going out and gathering evidence to find out what’s true about reality. All the statistics mumbo-jumbo is actually in the way of finding out the truth, and you should do as little of it as is actually necessary to get value out of your data. In this case, there is no data,[10] a fact that is obscured by the elaborate, official-seeming presentation and compelling fictional scenario.
[1] AI 2027 predictions by Daniel Kokotajlo, Scott Alexander, Thomas Larsen, Eli Lifland, and Romeo Dean. Published in April 2025.
[2] Actually, I think LLMs are already plateauing, but focus on out-of-model (agency) or reasoning progress has covered this up by giving AI extra capabilities without substantially improving the model’s actual within-model intelligence.
[3] Scott Aaronson points out that if P=NP, the “world would be a profoundly different place than we usually assume it to be. There would be no special value in ‘creative leaps,’ no fundamental gap between solving a problem and recognizing the solution once it’s found. Everyone who could appreciate a symphony would be Mozart; everyone who could follow a step-by-step argument would be Gauss; everyone who could recognize a good investment strategy would be Warren Buffett.” This is basically the distinction I am referring to here.
[4] Therapy would be one example of a conversation that involves more deep thinking. Long pauses are an indication lots of deep thinking is happening.
[5] This is not quite the distinction between System 1 and System 2 thinking, from what I understand. Shallow thinking encompasses all System 2 (conscious) thinking and some System 1 thinking, and deep thinking is just the kind of System 1 thinking you do when focusing. The point is that shallow thinking is algorithmically/computationally less demanding. I don’t actually know if shallow and deep thinking are fundamentally different, but it doesn’t matter for this discussion. It only matters that they’re practically different.
[6] By the way, Daniel Kokotajlo has adjusted his estimates slightly from 2027 to 2028 since publishing in response to some criticism. I am responding to the original estimates, though the original estimates had large error margins anyway, and I don’t think shifting from 2027 to 2028 meaningfully changes any of my criticisms.
[7] SC: “An AI system that can do the job of the best human coder on tasks involved in AI research but faster, and cheaply enough to run lots of copies.” Specifically: “An AI system for which the company could run with 5% of their compute budget 30x as many agents as they have human researchers, each which is on average accomplishing coding tasks involved in AI research… at 30x the speed… of the company’s top coder.” SAR: “An AI system that can do the job of the best human AI researcher but faster, and cheaply enough to run lots of copies...” Specifically: “An AI system that can do the job of the best human AI researcher but 30x faster and with 30x more agents, as defined above in the superhuman coder milestone...” I think these definitions are basically fine even though they assume coders are basically fungible, only differing in the speed it takes to accomplish a task.
[8] For an example of someone’s intuition conflicting with their conscious analysis, read my new short story Suloki and the Magic Stones.
[10] Or at least, no good data, at least not for the Timeline/Takeoff sections.
Thanks for writing this up, glad to see the engagement! I’ve only skimmed and have not run this by any other AI 2027 authors, but a few thoughts on particular sections:
I agree with most but not all of these in the median case, AI 2027 was roughly my 80th percentile aggressiveness prediction at the time.
Edited to add, I feel like I should list the ones that I have <50% on explicitly:
I disagree re: novel funny jokes, seems plausible that this bar has already been passed. I agree with the rest except maybe clever prose, depending on the operationalization.
Disagree but not super confident.
I disagree with the first clause, but I’m not sure what you mean because advances in reasoning and agency seem to me like examples of increases in general intelligence. Especially staying on task for longer without supervision. Are you saying that these reasoning and agency advances will mostly come from scaffolding rather than the underlying model getting smarter? That I disagree with.
Disagree on the first two sentences.
I don’t follow self-driving stuff much, but this might depend on location? Seems like good self-driving cars are getting rolled out in limited areas at the moment.
As you touch on later in your post, it’s plausible that we made a mistake by focusing on 2027 in particular:
I think this is a very reasonable concern and we probably should have done better in our initial release making our uncertainty about timelines clear (and/or taking the time to rewrite and push back to a later time frame, e.g. once Daniel’s median changed to 2028). We are hoping to do better on this in future releases, including via just having scenarios be further out, and perhaps better communicating our timelines distributions.
Also:
What do you mean by this? My guess is that it’s related to the communication issues on timelines?
Agree.
I very much understand this take and understand where you’re coming from because it’s a complaint I’ve had regarding some previous timelines/takeoff forecasts.
Probably some of our disagreement is very tied-in to the object-level disagreements about the usefulness of doing this sort of forecasting; I personally think that although the timelines and takeoff forecasts clearly involved a ton of guesswork, they are still some of the best forecasts out there, and we need to base our timelines and takeoff forecasts on something in the absence of good data.
But still, since we both agree that the forecasts rely on lots of guesswork, even if we disagree on their usefulness, we might be able to have some common ground when discussing whether the presentation was misleading in this respect. I’ll share a few thoughts from my perspective below:
I think it’s a very tricky problem to communicate that we think that AI 2027 and its associated background research is some of the best stuff out there, but is still relying on tons of guesswork because there’s simply not enough empirical data to forecast when AGI will arrive, how fast takeoff will be, and what effects it will have precisely. It’s very plausible that we messed up in some ways, including in the direction that you posit.
Keep in mind that we have to optimize for a bunch of different audiences. I’d guess that for each direction (i.e. taking the forecast too seriously vs. not seriously enough), many people came away with conclusions too far in that direction, from my perspective. This also means that some others have advertised our work in a way that seems overselling to me, though others have IMO undersold it.
As you say, we tried to take care to not overclaim regarding the forecast, in terms of the level of vibes it was based on. We also explicitly disclaimed our uncertainty in several places, e.g. in the expandables “Why our uncertainty increases substantially beyond 2026” and “Our uncertainty continues to increase.” as well as “Why is it valuable?” right below the foreword.
Should we have had something stronger in the foreword or otherwise more prominent on the frontpage? Yeah, perhaps; we iterated on the language a bunch to try to make it convey all of (a) that we put quite a lot of work into it, (b) that we think it’s state-of-the-art or close on most dimensions and represents substantial intellectual progress, but also (c) giving the right impression about our uncertainty level and (d) not overclaiming regarding the methodology. But we might have messed up these tradeoffs.
You proposed “AI 2027: Our best guess about what AI progress might look like, formulated by using math to combine our arbitrary intuitions about what might happen.” This seems pretty reasonable to me, except, as you might guess, I take issue with the connotation of arbitrary. In particular, I think there’s reason to trust our intuitions regarding guesswork given that we’ve put more thinking time into this sort of thing than all but a few people in the world, our guesswork was also sometimes informed by surveys (which were still very non-robust, to be clear, but I think improving upon previous work in terms of connecting surveys to takeoff estimates), and we have a track record to at least some extent. So I agree with arbitrary in some sense, in that we can’t ground out our intuitions into solid data, but my guess is that it gives the wrong connotation in terms of what weight the guesswork should be given relative to other forms of evidence.
I’d also not emphasize math if we’re discussing the scenario as opposed to timelines or takeoff speeds in particular.
My best guess is for the timelines and takeoff forecast, we should have had a stronger disclaimer or otherwise made more clear in the summary that they are based on lots of guesswork. I also agree that the summaries at the top had pretty substantial room for improvement.
I’m curious what you would think of something like this disclaimer in the timelines forecast summary (and a corresponding one in takeoff): Disclaimer: This forecast relies substantially on intuitive judgment, and involves high levels of uncertainty. Unfortunately, we believe that incorporating intuitive judgment is necessary to forecast timelines to highly advanced AIs, since there simply isn’t enough evidence to extrapolate conclusively.
I’ve been considering adding something like this but haven’t quite gotten to it due to various reasons, but potentially I should prioritize it more highly.
We’re also working on updates to these models and will aim to do better at communicating in the future! And will take into account suggestions.
I think this might have happened because it’s clear to us that we can’t make these sorts of forecasts without tons of guesswork, and we didn’t have much slack in terms of the time spent thinking about how these supplements would read to others; I perhaps made a similar mistake to one that I have previously criticized others for.
(I had edited to add this paragraph in, but I’m going to actually strike it out for now because I’m not sure I’m doing a good job accurately representing what happened and it seems important to do so precisely, but I’ll still leave it up because I don’t want to feel like I’m censoring something that I already had in a version of the comment.)
Potentially important context is that our median expectation was that AI 2027 would do much worse than it did, so we were mostly spending time trying to increase the expected readership (while of course following other constraints like properly disclaiming uncertainty). I think we potentially should have spent a larger fraction of our time thinking “if this got a ton of readership then what would happen”, and to be clear we did spend time thinking about this, but I think it might be important context to note that we did not expect AI 2027 to get so many readers, so a lot of our headspace was around increasing readership. Linking to some other comments I’ve written that are relevant to this: here, here
Thank you for taking the time to write such a detailed response.
My main critique of AI 2027 is not about communication, but the estimates themselves (2027 is an insane median estimate for AI doom) and that I feel you’re overconfident about the quality/reliability of the forecasts. (And I am glad that you and Daniel have both backed off a bit from the original 2027 estimate.)
Probably this is related to communication issues on timelines, yes. Also, I think if I genuinely believed everyone I knew and loved was going to die in ~2 years, I would probably be acting a certain way that I don’t sense from the authors of the AI 2027 document. But I don’t want to get too much into mind reading.
With respect to the communication issue, I think the AI 2027 document did include enough disclaimers about the authors’ uncertainty, and more disclaimers wouldn’t help. I think the problem is that the document structurally contradicts those disclaimers, by seeming really academic and precise. Adding disclaimers to the research sections would also not be valuable simply because most people won’t get that far.
Including a written scenario is something I can understand why you chose to do, but it also seems like a mistake for the reasons I mentioned in my post. It makes you sound way more confident than we both agree you actually are. And a specific scenario is also more likely to be wrong than a general forecast.
You have said things like:
“One reason I’m hesitant to add [disclaimers] is that I think it might update non-rationalists too much toward thinking it’s useless, when in fact I think it’s pretty informative.”
“The graphs are the result of an actual model that I think is reasonable to give substantial weight to in one’s timelines estimates.”
“In our initial tweet, Daniel said it was a ‘deeply researched’ scenario forecast. This still seems accurate to me.”
“we put quite a lot of work into it”
“it’s state-of-the-art or close on most dimensions and represents substantial intellectual progress”
“In particular, I think there’s reason to trust our intuitions”
As I said in my post, “The whole AI 2027 document just seems so fancy and robust. That’s what I don’t like. It gives a much more robust appearance than this blog post, does it not? But is it any better? I claim no.”
I don’t think your guesses are better than mine because of the number of man-hours you put into justifying them, nor because the people who worked on the estimates are important, well-regarded people who worked at OpenAI or have a better track record, nor because the estimates involved surveys, wargames, and mathematics.
I do not believe your guesses are particularly informative, nor do I think that about my own guesses. We’re all just guessing. Nor do I agree with calling them forecasts at all. I don’t think they’re reliable enough that anybody should be trusting them over their own intuition. In the end, neither of us can prove what we believe to a high degree of confidence. The only thing that will matter is who’s right, and none of the accoutrements of fancy statistics, hours spent researching, past forecasting successes, and so on will matter.
Putting too much work into what are essentially guesses is also in itself a kind of communication that this is Serious Academic Work—a kind of evidence or proof that people should take very seriously. Which it can’t be, since you and I agree that “there’s simply not enough empirical data to forecast when AGI will arrive”. If that’s true, then why all the forecasting?
(All my criticism is about the Timelines/Takeoff Forecasting, since these are things you can’t really forecast at this time. I am glad the Compute Forecast exists, and I didn’t read the AI Goals and Security Forecasts)
Okay, it sounds like our disagreement basically boils down to the value of the forecasts as well as the value of the scenario format (does that seem right?), which I don’t think is something we’ll come to agreement on.
Thanks again for writing this up! I hope you’re right about timelines being much longer and 2027 being insane (as I mentioned, it’s faster than my median has ever been, but I think it’s plausible enough to take seriously).
edit: I’d also be curious for you to specify what you mean by academic? The scenario itself seems like a very unusual format for academia. I think it would have seemed more serious academic-y if we had ditched the scenario format.
Perhaps we will find some agreement come Christmastime 2027. Until then, thanks for your time!
edit: Responding to your edit, by seeming academic, I meant things like seeming “detailed and evidence-based”, “involving citations and footnotes”, “involving robust statistics”, “resulting in high-confidence conclusions”, and stuff like that. Even the typography and multiple authors makes it seem Very Serious. I agree that the scenario part seemed less academic than the research pages.
Reading further it seems like you are basically just saying “Timelines are longer than 2027.” You’ll be interested to know that we actually all agree on that. Perhaps you are more confident than us; what are your timelines exactly? Where is your 50% mark for the superhuman coder milestone being reached? (Or if you prefer a different milestone like AGI or ASI, go ahead and say that)
Unfortunately, it’s hard to predict it. I did describe how Grok 4[1] and GPT-5 are arguably evidence that the accelerated doubling trend between GPT-4o and o3 has been replaced by something slower. As far as I understand, were the slower trend to repeat METR’s original law (GPT-2 to GPT-4?[2]), we would obtain the 2030s.
But, as you remark, “we should have some credence on new breakthroughs<...> that would lead to superhuman coders within a year or two, after being appropriately scaled up and tinkered with.” The actual probability of the breakthrough is likely a crux: you believe it to be 8% a year and I think of potential architectures waiting to be tried. One such architecture is diffusion models[3] which have actually been previewed and could be waiting to be released.
So assuming world peace, the timeline could end up being modeled by a combination of scaling up compute and a few algorithmic breakthroughs with random acceleration effects; each breakthrough would have to be somehow distributed by the amount of research done, and then the most powerful agent would be trained to use the breakthrough, as happens with Agent-3 and Agent-4 being created from Agent-2 in the forecast.
Maybe a blog post explaining more about your timelines and how they’ve updated would help?
The worst-case scenario[4] also has timelines affected by compute deficiency. For instance, the Taiwan invasion is thought to happen by 2027 and could prompt the USG to force the companies to merge and to race (to AI takeover) as hard as they can.
[1] Grok 4 is also known to have been trained by spending similar amounts of compute on pretraining and RL. Is the same known about GPT-5?
[2] GPT-4 and GPT-4o were released in March 2023 and May 2024 and had only one doubling in 14 months. Something hit a plateau; then in June 2024 Anthropic released Claude 3.5 Sonnet (old), and a new trend began. As of now, the trend likely ended at o3, and Grok 4 and GPT-5 are apparently in the same paradigm, which could have faced efficiency limits.
[3] They do rapidly generate text (e.g. code). But I don’t understand how they, say, decide to look important facts up.
[4] Of course, the absolute worst scenario is a black swan like currency collapse or China’s response with missile strikes.
Yeah we are working on it sorry!
Hey Daniel, I loved the podcast with Dwarkesh and Scott Alexander. I am glad you have gotten people talking about this, though I’m of two minds about it, because as I say in my post, I believe your estimates in the AI 2027 document are very aggressive (and there were some communication issues there I discussed with Eli in another comment). I worry what might happen in 2028 if basically the entire scenario described on the main page turns out to not happen, which is what I believe.
My blog post is a reaction to the AI 2027 document as it stands, which doesn’t seem to have any banner at the top saying the authors no longer endorse the findings or anything. The domain name and favicon haven’t changed. I am now aware that you have adjusted your median timeline from 2027 to 2028, and that Eli has adjusted his estimate as well. The scenario itself describes some pretty crazy things happening in 2027, and the median estimates in the research (before adjustments) seem to put SC, SAR, SIAR, and ASI all in 2027. I definitely don’t agree that any of those milestones will be reached in 2027, nor in 2028.
Of course, predicting the future is hard the further out in time we go. I have a strongish sense for 2027, but I would put SC, SAR, SIAR, and ASI all at least ten years further out than that, and it’s really hard to know what the heck will be going on in the world by then, because unlike you I don’t believe this is a matter of continuous progress (for reasons mentioned in my post), and discontinuous progress is hard to predict. In 1902 Simon Newcomb said, “Flight by machines heavier than air is unpractical and insignificant, if not utterly impossible.” The next year the Wright brothers took off. I’m not sure trying to put a number on it is better than simply saying, “I don’t know.”
In the spirit of answering the question you asked, I’d predict SC in the year 2050. But this is such a low-confidence prediction as to be essentially worthless, like saying I think there’s a 50% chance God exists.
I’d like to know what you thought of my justification for my doubts about LLMs, if you had time to get that far.
Thanks for the critique & the reply btw! Very much appreciate you giving quantitative alternative credences to mine, it’s a productive way to focus the conversation I think.
Footnote #1 used to be attached to “We wrote a scenario...” but we wanted it to be more prominent in response to exactly the sort of criticism you are making, so we moved it up to be literally a footnote on the title. I suppose we could have put it in the main text itself.
My median is currently 2029 actually; at the time AI 2027 was published it was 2028.
OK, thanks, that’s helpful. So yeah, while we agree that SC probably won’t happen by 2027 EOY, we do still have a disagreement—I think it probably WILL happen in the next five years or so (and the rest of the team thinks it’ll probably happen in the next ten years or so) whereas you seem confident it WON’T happen before 2037. I hope you are right! I agree also that the future is very hard to predict, especially the farther out it is (and 2037+ is very far out)
There’s a lot to say about why I think SC will probably happen in the next five years or so. I’ll go leave line-by-line comments in the relevant section of your post!
Yeah, someone pointed out that footnote to me, and I laughed a bit. It’s very small and easy to miss. I don’t think you guys actually misrepresented anything. It’s clear from reading the research section what your actual timelines are and so on. I’m just pointing to communication issues.
Thanks for your responses! I’ll check them out.
Talking about 2027, the authors did inform the readers in a footnote, but revisions of the timelines forecast turned out to be hard to deliver to the general public. Let’s wait for @Daniel Kokotajlo to state his opinion on the doubts related to SOTA architecture. In my opinion these problems would be resolved by a neuralese architecture or an architecture which could be an even bigger breakthrough (neuralese with big internal memory?)
I basically agree that lack of neuralese/recurrence/etc. is probably significantly hampering AI capabilities (though ofc, it clearly has benefits for training efficiency otherwise the companies wouldn’t be doing it) however I’m not so convinced that CoT alone can’t get us to the SC milestone. A direct reply to your argument: Yes, it would suck if you had to have your memory wiped and write stuff down every ten seconds. But that’s partly because you haven’t been trained to live that way! Imagine instead the following hypothetical:
--Your natural brain short-term memory (but not long-term memory) is wiped every ten seconds. But you have a Neuralink device and retinal implant, and the neuralink device lets you ‘think’ arbitrary text into a text file, and the text is displayed on your retinal implant. At first, this would suck. But after years and years of training, you’d learn to use the text to think, much like ‘reasoning models’ do. Your thinking would presumably be less efficient, you’d still be worse off compared to ordinary humans, but it would be way way better than it was at the beginning before you had practice. After practice, you’d basically just be verbalizing your inner monologue constantly, putting it into the text stream, and then simultaneously reading it and also paying attention to what’s happening in the world around you.
I think that in theory there is nothing wrong with having your memory wiped every iteration, and that such an architecture could in theory get us to SC. I just think it’s not very efficient and there would be a lot of repeated computation happening between predicting each word.
I mean yeah, totally agree re: repeated computation and inefficiency. But there’s no rule that says the first SC has to be close to the limits of efficiency. On the contrary, just as how the first viable airplanes were extremely shitty compared to the airplanes of today, the first viable SC will probably be shitty in various ways (e.g. data-efficiency) and perhaps this’ll be one of those ways.
I think even in the case that AI 2027 is directionally correct (very fast AI progress) the concrete details are likely to be wrong, so I’m not sure how impressed one should be if your predictions turn out to be correct.
About “it’s all just vibes”: AI 2027 is strongly based on the METR time horizon analysis. I think it would be more fruitful to critique and analyse that. Stuff like the time from SC to SAI seems like epicycles. Though the biggest uncertainty in AI 2027 probably comes from the assumption of recursive improvement.
I am not sure how fruitful the “shallow vs deep thinking” terminology is. What you explain in more detail is what I call “knowledge integration” and “learning while problem solving” which is both about humans having more powerful representations that can be modified while mulling stuff over and improved by integrating data from other domains.
Your algorithmic explanation for LLM shortcomings seems to be wrong and based on a misunderstanding of how LLMs work:
As joseph_c already mentioned, the human brain (as an nn architecture) is much, much wider and shallower than a GPT. One of your examples, coming up with clever jokes, also doesn’t require enough time for humans to engage in a lot of recursive thought.
Also, LLMs do actually keep the entire earlier state around, that’s what the KV-cache is. The computation of each new token does access the fine-grained vector representation of earlier tokens. There is no memory wiping going on.
I think the opposite is correct: LLMs are not nearly wide enough. As a consequence their representation of the “the problem” or “the situation” is impoverished.
I am predicting a world that looks fantastically different from the world predicted by AI 2027. It’s the difference between apocalypse and things basically being the same as they are now. The difference between the two is clear.
I agree that having internal representations that can be modified while reasoning is something that enables deep thinking, and I think this is something LLMs are bad at. Because of the wideness/depth issue and the lack of recurrence.
I only have a lay understanding of how LLMs work, so forgive me if I’m wrong about the specifics. It seems to me the KV cache is just an optimization. Either way, the LLM’s output is deterministic on the input tokens, and information is not being lost. What I was pointing to was the fact that the feed-forward networks for the new token don’t have access to the past feed-forward states of the other tokens, so they can’t see e.g. what reasoning paths were dead ends, unless information about those dead ends made it into the output.

This is a toy example, but I’m imagining a far-future LLM with enough understanding of biology and chemistry baked into its (for some reason very huge wide/deep) feed-forward networks to cure cancer in a single layer (for some layer). Imagine in one run, the input is “the cure for cancer”. Imagine the attention dimension is very narrow. In one layer, the feed-forward network may cure cancer in this run, among doing many other things, and then possibly discard that information when going to the next layer. In a subsequent run on the input “the cure for cancer is”, it may cure cancer again, and this time include some detail of that cure in its output to the next layer, since now it’s more likely to be relevant to predicting the next token. When curing cancer the second time, it didn’t have access to any of the processing from the first time, only what previous layers outputted for previous tokens. Does that sound right?

If so, the fact that the LLM is strictly divided into layers, with feed-forward parts being wider than other parts, is a limitation on deep thinking. Obviously the example is an exaggeration, because a feed-forward layer wouldn’t be curing cancer on its own, but it speaks to the fact that even though information isn’t being lost, computation is segregated in a way that some processing done in previous runs isn’t available to future runs.
I already responded to what joseph_c said about the human brain, but I’ll go into a bit more detail here. Progressing 200 steps forward in a feed-forward neural network is not nearly as “deep” as progressing 200 neurons in any direction in a recurrent network, and either way a 200 neuron chain of processing is not a lot. I suspect when doing deep thinking, the depth of neural firings in humans would be much greater, over a longer period of time. I think brains are deeper than LLMs, and only wider in the sense that they’re currently larger overall.
Coming up with new clever jokes does take a lot of time for humans actually. Stand-up comedians spend hours writing every day to write one hour of clever jokes total per year. When people come up with jokes that are funny in conversation, that is the product of one of three things:
The joke isn’t particularly clever, but people are in the mood to laugh
The joke is clever, and you got lucky
The joke is funny because you’re a funny person who already has a bunch of “joke formats” memorized, which makes telling funny jokes on the fly easier. But even then, it’s not fully shallow, and you can’t do it reliably. It just makes it easier.
I’m not sure, but I think you possibly could make an LLM that is so extremely wide that it could cure cancer, be superintelligent, etc. But I think actually training/running that network would be so exorbitantly expensive that you shouldn’t bother (for the reasons I pointed to in my post), and that’s why LLMs will plateau compared to less limited architectures.
That is the misconception. I’ll try to explain it in my words (because frankly despite knowing how a transformer works, I can’t understand Radford Neal’s explanation).
In the GPT architecture each token starts out as an embedding, which is then, in each layer, enriched with information from previous tokens and with knowledge stored in the network itself. So you have a vector which is modified in each layer; let’s call the output of the $n$-th layer $v_n$.
The computation of $v_n$ accesses the $v_{n-1}$ of all previous tokens! So in your example, if in layer $n-1$ at some token the cure for cancer is discovered, all following tokens will have access to that information in layer $n$. The model cannot forget this information. It might never access it again, but the information will always be there for the taking.
This is in contrast to a recurrent neural network that might actually forget important information if it is unfortunate in editing its state.
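As a runnable schematic of that data flow (numpy, invented shapes, with a crude stand-in for attention, so purely illustrative): at layer $n$, position $t$ reads the layer $n-1$ vectors of every position up to $t$, so whatever made it into the token stream stays available to all later layers and later positions.

```python
import numpy as np

def toy_forward(v0, layer_weights):
    """v0: (T, d_model) embeddings; each W in layer_weights defines one 'layer'."""
    v = v0                                          # v[t] plays the role of v_0
    for W in layer_weights:                         # layer n consumes all v_{n-1}
        new_v = []
        for t in range(len(v)):
            context = v[: t + 1].mean(axis=0)       # crude stand-in for attention
            new_v.append(v[t] + context @ W)        # enrich v_{n-1}[t] into v_n[t]
        v = np.stack(new_v)                         # nothing in the stream is erased;
    return v                                        # it can only be ignored

out = toy_forward(np.random.randn(5, 16), [np.random.randn(16, 16) for _ in range(3)])
```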
I believe I understood Radford Neal’s explanation and I understand yours, as best I can tell, and I don’t think it so far contradicts my model of how LLMs work.
I am aware that the computation of $v_n$ has access to the $v_{n-1}$ of all previous tokens. But the $v_{n-1}$ are just the outputs of the feed-forward networks of the previous layer. Imagine a case where the output was 1000 times smaller than the widest part of the feed-forward network. In that case, most of the information in the feed-forward network would be “lost” (unavailable to $v_n$).
Of course, you assume that if the model is well-trained, the most pertinent information for predicting the next token will make it into the output. But “the most pertinent information” and “all the information” are two different things, and some information might seem more relevant now that the new token has appeared, leading to duplicate work, or even to cases where a previous run happened to understand something the subsequent run did not.
As Radford Neal also mentioned, the fact that the model may/may not properly use information from previous states is another possible issue.
This is all pretty complicated so hopefully what I’m saying is clear.
The function of the feedforward components in transformers is mostly to store knowledge and to enrich the token vectors with that knowledge. The wider you make the ff-network the more knowledge you can store. The network is trained to put the relevant knowledge from the wide hidden layer into the output (i.e. into the token stream).
I fail to see the problem in the fact that the hidden activation is not accessible to future tokens. The ff-nn is just a component to store and inject knowledge. It is wide because it has to store a lot of knowledge, not because the hidden activation has to be wide. The full content of the hidden activation in isolation just is not that relevant.
Case in point: nowadays the ff-nns actually look different from how they did in GPT-3. They have two hidden layers, with one acting as a gating mechanism: the design has changed to allow parts of the hidden state to be actively erased!
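For reference, a minimal sketch of the gated design I mean (a SwiGLU-style block of the kind used in many recent open models; the names and shapes here are illustrative, not taken from any specific model):

```python
import numpy as np

def silu(z):
    return z / (1.0 + np.exp(-z))

def gated_ffn(x, W_gate, W_up, W_down):
    gate = silu(x @ W_gate)      # the gate can push parts of the hidden state to ~0
    hidden = gate * (x @ W_up)   # elementwise gating "erases" those parts
    return hidden @ W_down       # project back down to the (narrower) token stream

d_model, d_ff = 16, 64
y = gated_ffn(np.random.randn(d_model),
              np.random.randn(d_model, d_ff),
              np.random.randn(d_model, d_ff),
              np.random.randn(d_ff, d_model))
```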
Also: this seems very different from what you are talking about in the post; it has nothing to do with “the next run”. The hidden layer activations aren’t even “accessible” in the same run! They are purely internal “gears” of a subcomponent.
It also seems to me like you have retreated from
to “intermediate activations of ff-components are not accessible in subsequent layers and because these are wider than the output not all information therein contained can make it into the output”.
I’ll admit I am not confident about the nitty-gritty details of how LLMs work. My two core points (that LLMs are too wide vs. deep, and that LLMs are not recurrent and process in fixed layers) don’t hinge on the “working memory” problems LLMs have. But I still think that seems to be true, based on my understanding. For LLMs, compute is separate from data, so the neural networks have to be recomputed each run, with the new token added. Some of their inputs may be cached, but that’s just a performance optimization.
Imagine an LLM is processing some text. At layer n, the feed-forward network has (somehow, and as the first layer to do so) decided the text definitely relates to hostility and maybe relates to politics, but it isn’t really sure, so let’s say the part about politics doesn’t really make it into the output for that layer, because there’s more important information to encode (it thinks). Then in the next run, the token “Trump” is added to the input. At layer n, the feed-forward network has to decide from scratch that this token is related to politics. Nothing about the previous “this seems kinda political, not sure” decision is stored in the LLM, even though it was actually computed. In an alternative architecture, maybe the “brain area” associated with politics would be slightly active already, then the token “Trump” comes in, and now it’s even more active.
it’s all there for layer n+1’s attention to process, though. at each new token position added to the end, we get to use the most recent token as the marginal new computation result produced by the previous token position’s forward pass. for a token position t, for each layer n, n cannot read the output of layer n at earlier token i<t, but n+1 can read everything that happened anywhere in the past, and that gathering process is used to refine the meaning of the current token into a new vector. so, you can’t have hidden state build up in the same way, and each token position runs a partially-shared algorithm. but you can have unhidden state build up, and that unhidden state gets you full turing completeness.
(“brain area” equivalent would be “feature subspace” afaik. which is actually a slightly more general concept that also covers when a human brain lights up in ways that aren’t regionally localized)
Does this not mean the following though?
In layer n, the feed-forward network for token position t will potentially waste time doing things already done in layer n during tokens i<t.
This puts a constraint on the ability of different layers to represent different levels of abstraction, because now both layer n and n+1 need to be able to detect whether something “seems political”, not just layer n.
This means the network needs to be deeper when we have more tokens, because token t needs to wait until layer n+1 to see if token t-1 had the feature “seems political”, and token t+1 needs to wait until layer n+2 to see if token t had the feature “seems political”, and so on.
“...feed forward networks for the new token don’t have access to the past feed-forward states of the other tokens...”
This isn’t correct. The attention mechanism can move information from the neural network outputs at previous times to the current time, that is then fed into the feedforward network for the current time. The basic transformer mechanism is to alternate cross-time attention computations with within-current-time neural network computations, over many layers. Without access to information from past times, performance would obviously be atrocious.
In a sense, the KV cache that retains this information from past times is “just” an optimization, because the computations are (in theory, not always in practice) deterministic, so one could just redo them again for every previous token when predicting the next token (assuming the previously-generated tokens are retained). But that doesn’t seem enough to support your argument.
Of course, it’s quite possible that the models don’t attend very well to the past states, and so suffer to some extent from the issues you mention, but it’s not a fundamental property of the architecture.
Again, I could be misunderstanding, but it seems like only outputs of the neural networks are being stored and made available here, not the entire neural network state.
This was the purpose of my cancer-curing hypothetical. Any conclusions made by the feed-forward network that don’t make it into the output are lost. And the output is narrower than the widest part of the feed-forward network, so some information is “lost”/unavailable to subsequent tokens.
Models not attending very well to past states could be an additional factor worth considering, but I’m not sure if that is or isn’t true.
OK, I think I more clearly see what you’re saying. The hidden unit values in a feedforward block of the transformer at a previous time aren’t directly available at the current time—only the inputs of that feedforward block can be seen. But the hidden unit values are deterministic functions of the inputs, so no information is lost. If these feedforward blocks were very deep, with many layers of hidden units, then keeping those hidden unit values directly available at later times might be important. But actually these feedforward blocks are not deep (even though the full network with many such blocks is deep), so it may not be a big issue—the computations can be redundantly replicated if it helps.
I’m not really talking about true information loss, more like computation being repeated that doesn’t need to be.
And yes the feedforward blocks can be like 1 or 2 layers deep, so I am open to this being either a small or a big issue, depending on the exact architecture.
I want to register that I’m happy people are putting alternative, less rapid forecasts out there publicly, especially when they go against prevailing forum sentiments. I think this is a good thing :)
Thank you!
Ha, nice!
I don’t find this argument compelling, because the human brain is much wider and possibly shallower than GPT-3. Humans have a conscious reaction time of about 200 milliseconds, while neurons take about 1ms to influence their neighbors, meaning an upper bound on the depth of a conscious reaction is 200 neurons.
I expect humans are not doing deep thinking in a 200 ms conscious reaction.
Benchmarks are best-case analyses of model capabilities. A lot of companies optimize for the benchmarks, but is this inherently bad? If the process is economically valuable and repetitive, I don’t care how the LLM gets it done, even if it is memorizing the steps.
I think the benchmarks give a misleading impression of the capabilities of AI. It makes it seem like they’re on the verge of being as smart as humans. It makes it sound like they’re ready to take on a bunch of economically valuable activity that they’re not, leading to the issues currently happening with bosses making their employees use LLMs, for example.
I strongly suspect that wargames were involved in a different part of the forecast, when the authors tried to find out what would happen once the superhuman coders were invented and stolen. Then both[1] sides of the race would integrate the coders in order to make AI research as fast as possible. Next the sides would race hard, ignoring the need to ensure that the AIs are actually aligned. This neglect would lead to the AIs becoming adversarially misaligned. While the scenario assumes that adversarial misalignment gets discovered, it might also fail to get discovered, in which case the leading company would race all the way to a fiasco.
Expert feedback is the only potential source of estimates for takeoff speeds after superhuman coders. Trend extrapolation was the method used to create the timelines forecast, which is a little better grounded in reality than the takeoff forecast, but it contained mistakes, and on top of that the actual time-horizon trend is likely to experience a slowdown rather than being superexponential.
My understanding of timeline-related details
The superexponential trend was likely an illusion caused by the accelerated trend between GPT-4o and o1 (see METR’s paper, page 17, figure 11). While o3, released on April 16, continued the trend, the same cannot be said of Grok 4 or GPT-5, released in July and August of 2025. In addition, the failure of Grok 4, unlike GPT-5, COULD have been explained away by xAI’s incompetence.
Kokotajlo already claims to have begun working on an AI-2032 branch where the timelines are pushed back, or that “we should have some credence on new breakthroughs e.g. neuralese, online learning, whatever. Maybe like 8%/yr?[2] Of a breakthrough that would lead to superhuman coders within a year or two, after being appropriately scaled up and tinkered with.”
Here I talk about two sides because OpenBrain and other American companies have the option to unite their efforts or the USG can unite the companies by force.
I don’t understand why Kokotajlo chose 8%/yr as the estimate. We don’t know how easy it will be to integrate neuralese into LLMs. In addition, there is Knight Lee’s proposal, and my proposal to augment the models with a neuralese page selector, which keeps the model semi-interpretable, since the neuralese part guides the attention of the CoT-generating part to the important places. Oh, and there is stuff like diffusion models, which was actually tried by Google.
I agree with where you believe the wargames were used.
I think trend extrapolation from previous progress is a very unreliable way to predict progress. I would put more stock into a compelling argument for why progress will be fast/slow, like the one I hope I have provided. But even this is pretty low-confidence compared to actual proof, which nobody has.
In this case, I don’t buy extrapolating from past LLM advances because my model is compatible with fast progress up to a point followed by a slowdown, and the competing model isn’t right just because it looks like a straight line when you plot it on a graph.
When we speak about very near catastrophes, a reverse Doomsday argument is in play: I am unlikely to be in a position just before the catastrophe. If you think you are dying, or that ASI is coming tomorrow, it is reasonable to be skeptical about it.
I think this is true, but, like the Lindy effect, it is a very weak form of evidence that is basically immediately ignorable in light of any stronger evidence gained by actually examining object-level reality.
It can become very strong for shorter time predictions. If I say that the end of the world is tomorrow, it has a very small a priori probability, and a very large update is needed to override it.
To be completely honest, I think the best argument against AI 2027’s scenario is that it relies on the assumption that we will soon be in a super-exponential progress regime, we don’t have much evidence that we are on a super-exponential trajectory, and we have reason to believe the data points that vindicate super-exponential trajectories are fundamentally temporary and non-extrapolatable.
We don’t really need any more detailed argument than that, and we shouldn’t go too much into details here, because of the fact that detailed stories must become either equally or less probable for every detail added.
Edit: I will likely respond to comments slowly, if at all, due to rate limits.
I am not adding more detail to my prediction, I’m adding more detail to my justification of that prediction, which doesn’t make my prediction less probable. Unless you think predictions formed on the basis of little information are somehow more robust than predictions formed based on lots of information.
As for denying the super-exponential trend, I agree. I don’t put a lot of stock in extrapolating from past progress at all, because breakthroughs are discontinuous. That’s why I think it’s valuable to actually discuss the nature of the problem, rather than treating the problem as a black box we can predict by extrapolation.
Back in 23-24, when I would ask friends at OpenAI and Anthropic and DeepMind about how far away this sort of architecture was from being used in flagship models, they would generally say “a few years away.” Hence the prediction for AI 2027. To be clear, this isn’t exactly a large-sample-size scientific survey (it is neither large nor scientific). I definitely don’t feel confident that it’ll be in 2027 specifically. But I’d be curious to hear counterarguments for why we should be fairly confident it’s more than a decade away, for example.
I’m not confident neuralese is more than a decade away. That could happen by 2027 and I wouldn’t be shocked. I don’t think it’ll be a magic bullet though. I expect less of an impedance mismatch between neuralese and the model than language and the model, but reducing that impedance mismatch is the only problem being solved by neuralese.
(1) They can do reasoning, i.e. use their Chain of Thought to make intellectual progress that they can’t make within a single forward pass. This seems probably sufficient to me, given sufficient training to make good use of it. If not though, well, new architectures with recurrence/neuralese/etc. are being worked on by various groups and might start being competitive in the next few years. And if you are correct that this is the bottleneck to deep thinking, then soon the companies will realize this and invest a lot more in scaling up giant recurrent models or whatever. All this feels like it’ll be happening in the next decade to me, whereas you feel like it’s more than a decade away?
I went into more detail about why I think this is more than 10 years away in a follow-up blog post:
https://www.lesswrong.com/posts/F7Cdzn5mLrJvKkq3L/shallow-vs-deep-thinking-why-llms-fall-short
To be clear, I think that basically any architecture is technically sufficient if you scale it up enough. Take ChatGPT, make it enormous, throw oceans of data at it, and then allow it to store gigabytes of linguistic information. This is eventually a recipe for superintelligent AI if you scale it up enough. My intuition so far is that we basically haven’t made any progress when it comes to deep thinking, though, and as soon as LLMs start to deviate from learned patterns/heuristics, they hit a wall and become as smart as a ~shrimp. So I estimate the scale required to actually get anywhere with the current architecture is just too high.
I think new architectures are needed. I don’t think it will be as straightforward as “just use recurrence/neuralese”, though moving away from the limitations of LLMs will be a necessary step. I think I’m going to write a follow-up blog post clarifying some of the limitations of the current architecture and why I think the problem is really hard, not just a straightforward matter of scaling up. I think it’ll look something like:
Each deep problem is its own exponential space, and exploring exponential spaces is very computationally expensive. We don’t do that when running LLMs for a single pass. We barely do it when running with chain of thought or whatever. We only do it when training, and training is computationally very expensive, because exploring exponential spaces is very computationally expensive.

We should expect that an AI which can generically solve deep problems will be very computationally expensive to run, let alone train. There isn’t a cheap, general-purpose strategy for solving exponential problems, so you can’t necessarily re-use progress from one to help with another. An AI that solves a new exponential problem will have to do the same kind of deep thinking AlphaGo Zero did in training when it played many games against itself, learning patterns and heuristics in the process. And that was a best case, because you can simulate games of Go, but most problems we want to solve are not simulatable, so you have to explore exponential space in a much slower, more expensive way.
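To put rough numbers on “exploring exponential spaces is expensive” (a toy back-of-the-envelope, with illustrative branching factors rather than anything problem-specific):

```python
# Toy illustration: a depth-d decision tree with branching factor b has b**d
# leaf nodes, so exhaustive exploration blows up very quickly. Numbers below
# are illustrative only.
def leaves(b: int, d: int) -> int:
    return b ** d

print(leaves(10, 5))                  # shallow and narrow: 100,000 leaves
print(leaves(50, 20))                 # deeper: ~9.5e33 leaves, beyond brute force
print(len(str(leaves(250, 150))))     # Go-like numbers: a ~360-digit leaf count
```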
And btw I think LLMs currently mostly leverage existing insights/heuristics present in the training data. I don’t think they’re bringing much insight of their own right now, even during training. But that’s just my gut feel.
I think we can eventually make the breakthroughs necessary and get to the scale necessary for this to work, but I don’t see it happening in five years or whatever.
Glad to see some common sense/transparency about uncertainty. It seems to me that AGI/ASI is basically a black swan event — by definition unpredictable. Trying to predict it is a fool’s errand, it makes more sense to manage its possibility instead.
It’s particularly depressing when people who pride themselves on being rationalists basically ground their reasoning on “line has been going up, therefore it will keep going up”, as if Moore’s law’s mere existence means it extends to any and all technology-related lines in existence[1]. It’s even more depressing when those “line go up” claims come from very flawed/contaminated benchmarks (like SWE-bench), or very skewed ones (like the 50% success aspect of the METR long-tasks benchmark, which imo is absolutely crucial for differentiating an autonomous agent vs. a supervised copilot).
Hopefully I’ll be able to mirror your sipping eggnog and gloating in Christmastime 2027.
[1] “Hume, I felt, was perfectly right in pointing out that induction cannot be logically justified.” (Popper)
There’s a high bar to clear here: LLM capabilities have so far progressed at a hyper-exponential rate with no signs of a slowdown [1].
7-month doubling time (early models)
5.7-month doubling time (post-GPT-3.5)
4.2-month doubling time (post-o1)
So, an argument for the claim that we’re about to plateau has to be more convincing than induction from this strong pattern we’ve observed since at least the release of GPT-2 in February 2019.
Your argument does not pass this high bar. You have made the same kind of argument that has been made again and again (and proven wrong again and again) throughout the past seven years we have been scaling up GPTs.
One can’t simply point out the ways in which the things that LLMs cannot currently do are hard in a way in which the things that LLMs currently can do are not. Of course, the things they cannot do are different from the things they can. This has also been true of the capability gains we have observed so far, so it cannot be used as evidence that this observed pattern is unlikely to continue.
So, you would need to go further. You would need to demonstrate that they’re different in a way that meaningfully departs from how past, successfully gained capabilities differed from earlier ones.
To make this more concrete, claims based on supposed architectural limitations are not an exception to this rule: many such claims have been made in the past and proven incorrect. The base rate here is unfavourable to the pessimist.
Even solid proofs of fundamental limitations are not by their nature sufficient: these tend to be arguments that LLMs cannot do X by means Y, rather than arguments that LLMs cannot do X.
To be convincing, you have to make an argument that fundamentally differentiates your objection from past failed objections.
[1] based on METR’s research https://metr.org/blog/2025-03-19-measuring-ai-ability-to-complete-long-tasks/
I explained in my post that I believe the benchmarks are mainly measuring shallow thinking. The benchmarks include things like completing a single word of code or solving arithmetic problems. These unambiguously fall within what I described as shallow thinking. They measure existing judgement/knowledge, not the ability to form new insights.
Deep thinking has not progressed hyper-exponentially. LLMs are essentially shrimp-level when it comes to deep thinking, in my opinion. LLMs still make extremely basic mistakes that a human 5-year-old would never make. This is undeniable if you actually use them for solving problems.
The distinction between deep and shallow thinking is real and fundamental. Deep thinking is non-polynomial in its time complexity. I’m not moving the goalposts to include only whatever LLMs happen to be bad at right now. They have always been bad at deep thinking, and continue to be. All the gains measured by the benchmarks are gains in shallow thinking.
I believe I have done so, by claiming deep thinking is of a fundamentally different nature than shallow thinking, and denying any significant progress has been made on this front.
If you disagree, fine. Like I said, I can’t prove anything, I’m just putting forward a hypothesis. But you don’t get to say I’ve been proven wrong. If you want to come up with some way of measuring deep thinking and prove LLMs are or are not good at it, go ahead. Until that work has been done, I haven’t been proven wrong, and we can’t say either way.
(Certain things are easy to measure/benchmark, and these things tend to also require only shallow thinking. Things that require deep thinking are hard to measure for the same reason they require deep thinking, and so they don’t make it into benchmarks. The only way I know how to measure deep thinking is personal judgement, which obviously isn’t convincing. But the fact this work is hard to do doesn’t mean we just conclude that I’m wrong and you’re right.)
Except that Grok 4 and GPT-5 arguably already didn’t adhere to the faster doubling time. And I say “arguably” because of Grok failing some primitive tasks and Greenblatt’s pre-release prediction of GPT-5’s time horizon. While METR technically didn’t confirm the prediction, METR itself acknowledged that it ran into problems when trying to calculate GPT-5’s time horizon.
Another thing to consider is that Grok 4’s SOTA performance was achieved by using similar amounts of compute for pretraining and RL. What is Musk going to do to ensure that Grok 5 is AGI? Use some advanced architecture like neuralese?
EDIT: you mention a 5.7-month doubling time post-GPT-3.5. But there actually was a plateau or slowdown between GPT-4 and GPT-4o, which was followed by the accelerated trend from GPT-4o to o3.
I don’t think there was a plateau. Is there a reason you’re ignoring Claude models?
Greenblatt’s predictions don’t seem pertinent.
Look at the METR graph more carefully. The Claudes which METR evaluated were released during the period I called the accelerated trend from GPT-4o to o3 (except for Claude 3 Opus, but it wasn’t SOTA even in comparison with the GPT-4-to-GPT-4o trend).
With pre-RLVR models we went from a 36 second 50% time horizon to a 29 minute horizon.
Between GPT-4 and Claude-3.5 Sonnet (new) we went from 5 minutes to 29 minutes.
I’ve looked carefully at the graph, but I saw no signs of a plateau nor even a slowdown.
I’ll do some calculation to ensure I’m not missing anything obvious or deceiving myself.
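For concreteness, here is the kind of calculation I mean, using the horizon figures quoted above; the release dates are my own approximations (mid-March 2023 for GPT-4, late October 2024 for the Claude 3.5 Sonnet refresh), so treat the output as illustrative rather than METR’s number:

```python
from datetime import date
from math import log2

def doubling_time_months(h_start, h_end, d_start, d_end):
    """Implied doubling time (in months) for a 50% time horizon going from
    h_start to h_end between dates d_start and d_end."""
    months = (d_end - d_start).days / 30.44
    return months / log2(h_end / h_start)

# GPT-4 (~5 min horizon) to Claude 3.5 Sonnet (new) (~29 min horizon),
# per the figures quoted in this thread; dates are my assumptions.
print(doubling_time_months(5, 29, date(2023, 3, 14), date(2024, 10, 22)))
# -> roughly 7.6 months under these assumptions
```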
I don’t see any sign of a plateau here. Things were a little behind-trend right after GPT-4, but of course there will be short behind-trend periods just as there will be short above-trend periods, even assuming the trend is projectable.
I’m not sure why you are starting from GPT-4 and ending at GPT-4o. Starting with GPT-3.5, and ending with Claude 3.5 (new) seems more reasonable since these were all post-RLHF, non-reasoning models.
AFAIK the Claude-3.5 models were not trained based on data from reasoning models?
I tweeted about why I think AI isn’t creative a few days ago. It seems like we have similar thoughts. A good idea comes from noticing a potential connection between ideas and recursively filling in the gaps through verbalizing/interacting with others. The compute for that right now is unjustified.
It seems like basically everything in this is already true today. Not sure what you’re predicting here.
I mean of course it’s true today, right? It would be weird to make a prediction “AI can’t do XX in the future” (and that’s most of the predictions here) if that isn’t true today.
I just don’t think there is much to this prediction.
It takes a set of specific predictions and says none of them will happen, and by the nature of the conjunctive prediction, most will not happen. It would be more interesting to hear how AI will and will not progress rather than just denying an already unlikely-to-be-perfect prediction.
Inevitably they’ll be wrong on some of these, but they’ll look more right on the surface level because they will be right on most of them.
If you think I’ll be right on most of these, then I think you disagree with the AI 2027 predictions.
Most of my predictions are simply contradictions of the AI 2027 predictions, which are a well-regarded series of predictions for AI progress by the end of 2027. I am stating that I disagree and why.
Thanks for writing this up! I also want to register that I agree with all of this, maybe except for the part where AIs can’t tell novel funny jokes—I expect this to be relatively easy. But of course it depends on the definition of ‘novel’.
I struggled to do this exercise myself because when I looked at “AI as Normal Technology”, I felt like I basically agreed with most of their thinking, but it was also hard to find concrete differences between their predictions and AI 2027, at least in the near term. For example, for things like “LLMs are broadly acknowledged to be plateauing”, it’s probably going to be concurrently both true and false in a way that’s hard to resolve—a lot of people may complain that it’s plateauing but the benchmark scores and the usage stats could show otherwise.
It’s funny everyone is doubting the funny jokes part. I view funny jokes as computationally hard to generate, probably because I’ve sat down and actually tried, and it doesn’t seem fundamentally easier than coming up with brilliant essay ideas or whatever. But most people just have experience telling jokes in the moment, which is a different kind of non-deep activity. Maybe AI will be better at that, but not so good at e.g. writing an hour of stand-up comedy material that’s truly brilliant?
Yes, this is somewhat ambiguous I admit. I’m kind of fine with that though. I’m not placing any bets, I’m just trying to record what I think is going to happen, and the uncertainty in the wording reflects my own uncertainty of what I think is going to happen.
I’ll register my prediction here as well. I largely agree with your projection, although my median case looks a little bit more advanced. Also, note that I am not vouching for your arguments.
75% - likely we live in a world that feels pretty normal. That is, similar to what you described or a bit more advanced, as mentioned.
Here are some places I differ from your predictions which might give insight into what I mean by “a little bit more advanced”:
- In general, I anticipate more progress, both in terms of tech, and its integration into our world.
- AI might have transformed some major industries and careers, even without providing novel research or human level insights. It’s still not enough to cause an unprecedented crisis or anything. It’s still in the range of historical economic transitions.
- It’s also possible that AI has come up with some valuable scientific insights, just not often enough to be considered TAI or to completely disrupt the world/economy/society.
- AI might be able to replace more coders than you’ve described, as well as other knowledge workers.
- AI will be able to tell genuinely funny jokes.
- Self-driving cars of the type you’ve described are possible, although I think 2029 would be a safer bet.
- There will be real advances, but overall Christmas 2027 will still feel like Christmas 2024. My grandparents (who have never used a smart phone or a laptop) won’t have noticed at all.
~ 8% or less on us living in a world like AI2027, or one with advances at least as fast and transformative. Foom lives here.
~ 8% goes to different ‘weird’ futures. For instance, what if robotics absolutely explodes, and we start seeing robots everywhere, but AI itself is still pretty bland? Or what if specialist systems take over the economy, but you still can’t really have a conversation with an AI that doesn’t fall apart quickly. Or there is a completely new paradigm that is more generally smart than LLMs, but is slow and lacks knowledge. Or there is AGI, but it is extremely expensive, or it’s sealed in a lab. Or etc. etc. etc. This category includes industrial revolution magnitude changes that aren’t just ‘LLMs get better and we have to deal with a new, often superior intelligence’. It also includes major advances in AI that don’t cause grand transformations. Eh, it’s kind of odd to lump these together I suppose. But the point of this category was to be a catch all for unpredictable scifi scenarios I guess.
~ 8% goes on a complete AI bust, where it’s generally accepted that it was a mistake to invest so much in AI and to integrate it into our economy. An AI winter is imminent and not very controversial. Undramatic AI plateaus do NOT live here.
This is all based on not having any major disruptions to the world. For instance, I’m not considering the implications of a global war, or another pandemic.
I should also note that while this puts my odds of 2027 Foom and Doom in the single digits or lower… that’s still an awfully high figure for the end of all humanity. Flip a coin 7-9 times. If it comes up heads each time, then every one of us will be dead in 3 years.
I definitely agree this will happen a lot sooner than superhuman coders. Growth of shallow reasoning ability is not enough to cure cancer and bring about the singularity, but some things will get weird. Some careers will probably vanish outright. I often describe some careers as “requiring general intelligence”, and by that I mean requiring deep thinking. For example, writing fiction or doing AI research. In a sense, when any one of these falls to AI, they all fall. Until then, it’ll only be the shallow jobs (transcription, for example) that can fall.
Agreed
It might be the case that LLMs develop different cognitive strategies to cope with this, such as storing the working memory on the CoT tokens, so that the ephemeral intermediate steps aren’t load-bearing. The effect would be that the LLM+CoT system acts as… whatever part of our brain explores ideas.
I didn’t go into as much detail about this in my post as I planned to.
I think relying on chain of thought for coping with the working memory problem isn’t a great solution. The chain of thought is linguistic, and thereby linear/shallow compared to “neuralese”. A “neuralese” chain of thought (non-linguistic information) would be better, but then we’re still relying on an external working memory at every step, which is a problem if the working memory is smaller than the model itself. And potentially an issue even if the working memory is huge, because you’d have to make sure each layer in the LLM has access to what it needs from the working memory etc.
Could an LLM write a show like Seinfeld? This might actually be my test for whether I accept that LLMs are truly clever. Anyone who’s watched it knows that Seinfeld was great because of two reasons: (1) Seinfeld broke all the rules of a sitcom. (2) The show follows very relatable interactions on the most non-trivial issues between people and runs it with it for seasons. There is no persistent plot or character arc. You have humans playing themselves. And yet it works.
I am pretty sure current LLMs could not write any competitive TV scripts.
Except that JustisMills on June 17 made a post titled “Ok, AI Can Write Pretty Good Fiction Now”.
I remember when ChatGPT came out, people were very impressed with how well it could write poetry. Except the poetry was garbage. They just couldn’t tell, because they lacked the taste in poetry to know any better. I think the same thing still applies to fiction/prose generated by ChatGPT. It’s still not good, but some people can’t tell.
To be clear about my predictions, I think “okay”/”acceptable” writing (fiction and nonfiction) will become easier for AI to generate in the next 2 years, but “brilliant”/”clever” will not really.
Yep, these conclusions intersect with the prognosis I made for myself at the end of 2024:
- There will be no technological singularity.
- Neural networks won’t change conceptually over the next 10 years.
- We won’t build strong AI based on just one or a few neural networks.
- Neural networks on their own won’t take jobs away.
- Robots will not rapidly and massively replace manual labor.
- The work of proactive professionals is safe for the next 10 years.
My predictions are based on:
- The view of current neural networks as one more building block among many: essentially a probabilistic database, no more and no less.
- The assumption that businesses/governments/money will focus on optimizing the current achievements, rather than on continuing risky experiments.
A much longer post with explanation is in my blog: https://tiendil.org/en/posts/ai-notes-2024-prognosis
I agree with you that the types of neural networks currently being used at scale are not sufficient for artificial superintelligence (unless perhaps scaled to an absurd level). I am not as confident that businesses won’t continue investing in risky experiments. For example, experiments into AI that does not “separate training from their operational mode”, or experiments into recurrent architectures, are currently being done.
I definitely don’t agree with your claim in the blog post that even if strong AI comes, we will all simply adapt. Your argument about more mature people finding common ground with less mature people ignores the fact that these people belong either to the same family or to the same legal system. A strong AI will not necessarily love you or care about following the law. In cases where humans don’t have those constraints, they tend not to always be so nice to one another. I think AI risk is an existential threat, if superintelligent AI does show up.
My statement about the lack of huge investments in risky experiments may really be too strong. In the end, we are speaking about people, and they are always unpredictable. In part, I formulated it that way to get a strong validation point for several of my personal models of how the world and society work. However, I still believe it is more probable than the opposite statement.
Speaking about strong AI.
The analogy with child-parent relations is the simplest one I found for the post. The history of humanity has a long record of communication between societies at different levels of maturity and with different cultures, of course. Those contacts didn’t always go well, but they also didn’t lead to the full extinction of one of the parties (in most cases).
Since a superintelligence will most likely be produced on the basis of the information humanity has produced (we don’t have any other source of data), I believe it will operate in a similar way ⇒ a struggle is possible, maybe even a serious one, but in the end we will adapt to each other.
However, this logic is relevant in situations where a strong AI appears instantly.
I don’t expect that, given there are no historical precedents for something complex appearing instantly without a long process of increasing complexity.
What I expect, assuming that strong AI does appear, is a long process of evolving AI tools of increasing complexity, to each of which humanity will adapt, as it has already adapted to LLMs. At some point, those tools will begin uniting into a kind of smaller AI, and we’ll adapt to them, and they will adapt to us. And so on, until we reach a point where the complexity of those tools is incomparably higher than the complexity of humanity. But by that time, we will have already adapted to them.
I.e., if such a thing ever happens, it will be a long enough co-evolution, rather than an instant rise of a superintelligent being and obliteration of humanity.