The shape of AGI: Cartoons and back of envelope

​​


[Cross posted on windowsontheory; see here for my prior writings]

There have been several studies to estimate the timelines for artificial general intelligence (aka AGI). Ajeya Cotra wrote a report in 2020 (see also 2022 update) forecasting AGI based on comparisons with “biological anchors.” That is, comparing the total number of FLOPs invested in a training run or inference with estimates on the total number of FLOPs and bits of information that humans/​evolution required to achieve intelligence.[1] In a recent detailed report, Davidson estimated timelines via a simulation (which you can play with here) involving 16 main parameters and 51 additional ones. Jacob Steinhardt forecasted how will large language models look in 2030.

In this blog post, I will not focus on the timelines or risks but rather on the shape we could expect AI systems to take and their economic impact. I will use no fancy models or math and keep everything on the level of cartoons or back-of-envelope calculations. My main contention is that even post-AGI, AI systems will be incomparable to humans and stay this way for an extended period of time, which may well be longer than the time to achieve AGI in the first place.[2]

The cartoon below illustrates the basic argument for why we expect AGI at some point. Namely, since capabilities grow with log compute, and compute (by Moore’s law) grows exponentially with time; eventually, AI systems should reach arbitrarily high capabilities.

Figure: Left: Cartoon of capabilities of AI systems as a function of computational resources. To a first approximation, historical improvement in capabilities scaled with the logarithm of commute invested. There are scenarios of either “hitting a wall” (e.g., running out of data) or “self improvement” (once we get near AGI, we can start using AIs to design better AIs) that could break this line in one direction or the other. Still, the baseline scenario is a straight line. Right: Cartoon of computational resources available per “moonshot investment” (e.g., hundreds of Billion dollars) as a function of time. We expect the computational resources per dollar to increase exponentially with time, eventually reaching the threshold needed for AGI.

Notes: There are established estimates for the slope of the graph on the right (compute per time). Whether it’s Moore’s law or other extrapolations, we expect an order-of-magnitude improvement per 5-10 years (considering possible speedups in effective compute due to algorithmic progress). In contrast, the quantitative form of the left graph is very uncertain; in particular, we don’t know the amount of compute required for AGI, nor the impact of exhausting data or self-improvement. Regarding the latter, it is possible that intelligence truly requires a large amount of computation, in which case there are hard bounds to the amount of optimization possible, even with the help of AI. Also, I expect that as we pick the “low hanging fruit”, discovering significant optimizations for AI will become more challenging over time, which may be balanced by AI making AI research more efficient. So my baseline assumption is that self-improvement will not change the shape of the graphs but rather increase the slope of the right graph by growing effective compute at a faster rate than that accounted for by hardware improvements on their own.

Why humans and AI will be incomparable

Naively, we might think that just like Chess and Go, once AI reaches reasonable (e.g., novice level) performance, it will quickly transition to completely dominating humans to the extent that the latter would provide no added benefit for any intellectual job. I do not expect this to be the case.

Why not? After all, by Moore’s law, once we get to a model of size X, we should be able to get a model of size 2X for the same price within two years. So, if an X-sized model functions as a novice, wouldn’t a 2X or 4X model achieve super-human performance? I believe the answer is No for two reasons. First, as far as we can tell, the capabilities of models grow with the logarithm of their computational resources. A 2X or even 10X-sized model would have a relatively modest advantage over an X-sized model. Second, unlike Chess or Go, there is no single number that quantifies general intelligence or even the intelligence needed for a particular job. A “50% human performance” model is likely to achieve this level of performance by a mixture of skills, in some of which it has better than human level while in others significantly worse. Increasing the model size will likely have different impacts on different skill levels.

100 times larger is not 100 times better

For the first point, consider GPT4 vs GPT3.5. The former’s training cost is estimated to be 100 times the latter (about vs FLOPs). GPT4 is, without a doubt, better than GPT 3.5. But there is a significant overlap between their capabilities.

Figure: (a) Comparison of GPT3.5 vs. GPT4 on benchmarks: data from Table 2 in GPT4 Technical Report. (b) Comparing GPT3.5 and GPT4 on solving random n linear equations in n variables for n=1,2,3,4. (c) GPT3.5 vs GPT4 performance on exams (Figure 4 in GPT4 report). (d) GPT3.5 (as in ChatGPT) vs. GPT4 adversarial factual accuracy queries (Figure 6 in the report). We can see that the 100x compute GPT-4 model is better than GPT-3.5, but more like a “college-student vs high-school student better” than a “human vs. chimpanzee better.”


It is true that models sometimes exhibit “emerging capabilities” in the sense that performance on a benchmark is no better than chance until resources cross a certain threshold. But even in these cases, the “emergence” after the threshold happens on a logarithmic scale. Moreover, any actual job consists of many tasks, which averages out this growth.

Figure: The right side is the average of the five random sigmoid curves on the left side. If a job involves many capabilities, then even if each capability emerges somewhat rapidly (e.g., within a couple of orders of magnitude), their average “evens out,” so overall performance grows more slowly.

AIs are not “silicon humans.”

AI systems result from a very different process than humans, and even when they achieve similar results, they are likely to do so differently. Large language models are not aimed at simulating a single human but rather at encoding all the knowledge of humanity. They are trained on and retain far more data than any human sees. Because of this, even when, on average, they perform on a level comparable to, for example, a college student, they can still surprise us for the better or worse. For example, very few humans could achieve above median performance in 20 AP exams but be so terrible at Tic-Tac-Toe that they cannot even recognize when they lost. Yet GPT-4 does precisely that.

Figure: Left: GPT-4 loses at Tic-Tac-Toe even when instructed to play optimally and doesn’t even recognize it has lost. Right: Anthropic’s Claude 2 is even worse, declaring a draw and bragging about its AI capabilities after it had lost.


Generally, the performance of a language model ranges widely, from performing truly surprising feats to failing at simple tasks. While some of these “kinks” are likely to be worked out in time, they point to a fundamental difference between models and humans, which is likely to persist. Models are not simply “silicon humans” and won’t have the same mixture of skills.

Figure: Cartoons of two scenarios. A priori, we might expect that as our models improve, their performance is first similar to a student, then a professional, and finally to an expert. Based on However, looking at GPT3.5’s and GPT4’s actual performance, models diverge significantly, performing at very different levels on different tasks, even within the same domain. (The curve and labels are just a cartoon.)

What does this mean?

As we integrate AI systems into our economy, I suspect they will significantly, maybe even radically, improve productivity (see Noy and Zhang). But AI systems will not be drop-in replacements for human workers. In a fascinating piece, Timothy Lee interviews professional translators. This is arguably the field that has been most ripe for AI takeover, given that tons of data are available, the inputs and outputs are text, and decent systems have been around for a while. (Google Translate was launched in 2006 and started using deep learning in 2016.) Yet, while AI may have depressed wage growth, it has not eliminated human translators. As Lee notes, there was “rapid growth in hybrid translation services where a computer produces a first draft, and a human translator checks it for errors.” Also, AI still falls behind in high-stakes cases, such as translating legal documents. This is on par with the history of automation in general; Bessen notes that “it appears that only one of the 270 detailed occupations listed in the 1950 Census was eliminated thanks to automation – elevator operators.” In another piece, Lee points out that while software and the Internet certainly changed the world, they didn’t “eat it” as was predicted and, in particular, haven’t (yet) disrupted healthcare and education.

In general, rather than AI systems replacing humans, I expect that firms will re-organize their workflow to adapt to AI’s skill profile. This adaptation will be easier in some cases than others. Self-driving cars are a case in point. Adoption is much more challenging because the road system is designed for humans, and self-driving cars must coexist with human drivers and pedestrians. Self-driving cars also demonstrate that it can take a long time between the point at which AI can handle 90% or even 99% of the cases until AIs can be trusted to have full autonomy without human supervision. (See another article by Lee on self-driving progress.) Regardless of the domain, adjusting jobs and systems to adapt to AIs will not be easy and will not be done overnight. Thus, I suspect that the path between a demonstration of an AI system that can do 90% of the tasks needed for job X to the actual wide-scale deployment of AI in X will be a long one. That doesn’t mean it will be a smooth transition: a decade or two might look like an eternity in machine learning years, but it is short compared to the 40 years or so that people typically work until retirement.

In fact, many human professions, including police officers, lawyers, civil servants, and political leaders, require judgment or wisdom no less or even more than intelligence. To use AI research terms, these jobs require the worker to be aligned with the values of the broader society or organization. For example, when we vote for public officials, most people (myself included) care much more about their alignment with our values than their intelligence or even competence. For such jobs, solving the alignment problem would be a necessary condition for deploying an AI. In fact, merely solving the alignment problem won’t be enough- we would have to convince the public that it has been solved. Combined with the hypothesis that strategic leadership would not be where AI’s competitive advantage lies, I suspect that AI will not take over those jobs any time soon, and perhaps not at all.


Acknowledgments. Thanks to Jonathan Shafer for telling me about the Tic-Tac-Toe example, and to Ben Edelman for trying it out with Claude 2.

  1. ^

    Some of Cotra’s estimates include FLOPs per second in the human brain, corresponding to in a lifetime, and total FLOPs done over evolution.

  2. ^

    The precise definition of “AGI” doesn’t matter much for this post, but for concreteness, assume this to correspond to the existence of a system that can perform most of the job-related tasks at an above-median level for most current human jobs that can be done remotely; this is similar to the definition I used here. One of the points of this post is that AGI will not be a single event but rather a process of increasingly growing capabilities.