Daniel Kokotajlo comments on Daniel Kokotajlo’s Shortform

Daniel Kokotajlo 27 Feb 2025 21:16 UTC
LW: 93 AF: 43
0
AF
My AGI timelines median is now in 2028 btw, up from the 2027 it’s been at since 2022. Lots of reasons for this but the main one is that I’m convinced by the benchmarks+gaps argument Eli Lifland and Nikola Jurkovic have been developing. (But the reason I’m convinced is probably that my intuitions have been shaped by events like the pretraining slowdown)
- Vladimir_Nesov 28 Feb 2025 19:19 UTC
  LW: 23 AF: 8
  8
  AF Parent
  
  my intuitions have been shaped by events like the pretraining slowdown
  
  I don’t see it. GPT-4.5 is much better than the original GPT-4, probably at 15x more compute. But it’s not 100x more compute. And GPT-4o is an intermediate point, so the change from GPT-4o to GPT-4.5 is even smaller, maybe 4x.
  
  I think 3x change in compute has an effect at the level of noise from different reasonable choices in constructing a model, and 100K H100s is only 5x more than 20K H100s of 2023. It’s not a slowdown relative to what it should’ve been. And there are models with 200x more raw compute than went into GPT-4.5 that are probably coming in 2027-2029, much more than the 4x-15x observed since 2022-2023.
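  
  A rough worked version of these ratios, taking the estimates above at face value (the 15x and 200x figures are this comment's guesses, not measured values, and putting them on a $\log_{10}$ scale is an assumption added for illustration):
  
  $$
  \frac{100\text{K}}{20\text{K}} = 5, \qquad \log_{10} 3 \approx 0.48, \qquad \log_{10} 15 \approx 1.18, \qquad \log_{10}(15 \times 200) = \log_{10} 3000 \approx 3.48
  $$
  
  So a 3x step is about half an order of magnitude, while the projected jump from the original GPT-4 is about three and a half.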
  - Daniel Kokotajlo 1 Mar 2025 6:55 UTC
    LW: 10 AF: 4
    0
    AF Parent
    Hmm, let me think step by step. First, the pretraining slowdown isn’t about GPT-4.5 in particular. It’s about the various rumors that the data wall is already being run up against. It’s possible those rumors are unfounded but I’m currently guessing the situation is “Indeed, scaling up pretraining is going to be hard, due to lack of data; scaling up RL (and synthetic data more generally) is the future.” Also, separately, it seems that in terms of usefulness on downstream tasks, GPT-4.5 may not be that much better than smaller models… well, it’s too early to say, I guess, since it seems they haven’t done all the reasoning/agency posttraining on GPT-4.5 yet.
    
    Idk. Maybe you are right and I should be updating based on the above. I still think the benchmarks+gaps argument works, and also, it’s taking slightly longer to get economically useful agents than I expected (though this could say more about the difficulties of building products and less about the underlying intelligence of the models; after all, RE-Bench and similar have been progressing faster than I expected).
    - Vladimir_Nesov 1 Mar 2025 15:35 UTC
      LW: 20 AF: 8
      1
      AF Parent
      My point is that a bit of scaling (like 3x) doesn’t matter, even though at the scale of GPT-4.5 or Grok 3 it requires building a $5bn training system, but a lot of scaling (like 2000x up from the original GPT-4) is still the most important thing impacting capabilities that will predictably happen soon. And it’s going to arrive a little bit at a time, so it won’t be obviously impactful at any particular step, and won’t do anything to disrupt the rumors of scaling no longer being important. It’s a rising sea kind of thing (if you have the compute).
      
      Long reasoning traces were always necessary to start working at some point, and the s1 paper illustrates that we don’t really have evidence yet that R1-like training creates rather than elicits nontrivial capabilities (things that wouldn’t be possible to transfer in a mere 1000 traces). Amodei is suggesting that RL training can be scaled to billions of dollars, but it’s unclear whether this assumes that AIs will automate creation of verifiable tasks. If constructing such tasks (or very good reward models) is the bottleneck, this direction of scaling can’t quickly get very far outside specialized domains like chess where a single verifiable task (winning a game) generates endless data.
      
      The quality data wall and flatlining benchmarks (with base model scaling) are about compute multipliers that depend on good data but don’t scale very far. As opposed to scalable multipliers like high sparsity MoE. So I think these recent 4x a year compute multipliers mostly won’t work above 1e27-1e28 FLOPs, which superficially looks bad for scaling of pretraining, but won’t impact the less legible aspects of scaling token prediction (measured in perplexity on non-benchmark data) that are more important for general intelligence. There’s also the hard data wall of literally running out of text data, but being less stringent on data quality and training for multiple epochs (giving up the ephemeral compute multipliers from data quality) should keep it at bay for now.
      What links here?
      Vladimir_Nesov's comment on A Bear Case: My Predictions Regarding AI Progress by Thane Ruthenis (6 Mar 2025 5:31 UTC; 9 points)
    - Trinley Goldenberg 1 Mar 2025 16:10 UTC
      5 points
      0
      Parent
      Hmm, let me think step by step.
      LLMs shaping humans’ writing patterns in the wild
    - Josh You 1 Mar 2025 23:40 UTC
      4 points
      0
      Parent
      The high cost and slow speed of GPT-4.5 seem like a sign OpenAI is facing data constraints, though we don’t actually know the parameters and OpenAI might be charging a bigger margin than usual (it’s a “research preview” not a flagship commercial product). If data were more abundant, wouldn’t GPT-4.5 be more overtrained and have fewer parameters?
      edit: FWIW Artificial Analysis measures GPT-4.5 at a not-that-bad 50 tokens per second whereas I’ve been experiencing a painfully slow 10-20 tokens/second in the chat app. So may just be growing pains until they get more inference GPUs online. But OpenAI does call it a “chonky” model, implying significant parameter scaling.
- No77e 27 Feb 2025 21:35 UTC
  18 points
  0
  Parent
  I’m convinced by the benchmarks+gaps argument Eli Lifland and Nikola Jurkovic have been developing
  I’ve tried searching for a bit, but I can’t find the argument. Is it public?
  - mapreader4 28 Feb 2025 0:38 UTC
    16 points
    2
    Parent
    Eli Lifland has a short summary here and says a longer draft is coming.
    - mattmacdermott 28 Feb 2025 17:34 UTC
      16 points
      4
      Parent
      You know you’re feeling the AGI when a compelling answer to “What’s the best argument for very short AI timelines?” lengthens your timelines
      - Daniel Kokotajlo 28 Feb 2025 18:49 UTC
        9 points
        0
        Parent
        yes! :D
        
        Relatedly, one of the things that drove me to have short timelines in the first place was reading the literature and finding the best arguments for long timelines. Especially Ajeya Cotra’s original bio anchors report, which I considered to be the best; I found that when I went through it bit by bit and made various adjustments to the parameters/variables, fixing what seemed to me to be errors, it all added up to an on-balance significantly shorter timeline.
        plex 1 Mar 2025 22:23 UTC
        4 points
        0
        Parent
        I had a similar experience a couple years back when running bio anchors with numbers which seemed more reasonable/less consistently slanted towards longer timelines to me, getting:
        before taking into account AI accelerating AI development, which I expected to bring it a few years earlier.
  - Daniel Kokotajlo 27 Feb 2025 21:49 UTC
    6 points
    0
    Parent
    Not yet, sorry. We are working on it.
- Seth Herd 27 Feb 2025 22:05 UTC
  15 points
  9
  Parent
  I roughly agree with that timeline, although I also emphasize the range over a point estimate; we need to be prepared for shorter as well as longer timelines, since there are so many ways to get faster as well as slower advances than we’d expect.
  One factor in my estimates after EAG is the practical difficulties of scripting for agents. It is reportedly even more frustrating to “debug” a sequence of prompts and model calls that malfunctions stochastically. I do expect automated processes similar to software improving its error-handling and debugging, but all of this will take time.
  I also wonder what your definition of “AGI” is; that’s often a deciding factor in timelines, since people mean so many different things. My definition is “can reason about anything, which requires nontrivial continuous learning”, which I take to be the original definition; that in turn requires some amount of autonomy for self-directed learning. See “Real AGI”. But there are lots of other definitions in play now too, and I don’t remember what yours is.
- Cole Wyeth 28 Feb 2025 2:39 UTC
  LW: 10 AF: 7
  −2
  AF Parent
  It’s wild to me that you’ve concentrated a full 50% of your measure in the next <3 years. What if there are some aspects of intelligence which we don’t know we don’t know about yet? It’s been over ~40 years of progress since the perceptron, how do you know we’re in the last ~10% today?
  - Daniel Kokotajlo 28 Feb 2025 6:25 UTC
    LW: 30 AF: 9
    21
    AF Parent
    Progress over the last 40 years has been not at all linear. I don’t think this “last 10%” thing is the right way to think about it.
    
    The argument you make is tempting, I must admit I feel the pull of it. But I think it proves too much. I think that you will still be able to make that argument when AGI is, in fact, 3 years away. In fact you’ll still be able to make that argument when AGI is 3 months away. I think that if I consistently applied that argument, I’d end up thinking AGI was probably 5+ years away right up until the day AGI was announced.
    
    Here’s another point. I think you are treating AGI as a special case. You wouldn’t apply this argument—this level of skepticism—to mundane technologies. For example, take self-driving cars. I don’t know what your views on self-driving cars are, but if you are like me you look at what Waymo is doing and you think “Yep, it’s working decently well now, and they are scaling up fast, seems plausible that in a few years it’ll be working even better and scaled to every major city. The dream of robotaxis will be a reality, at least in the cities of America.” Or consider SpaceX Starship. I’ve been following its development since, like, 2016, and it seems to me that it really will (probably but not definitely) be fully reusable in four years, even though this will require solving currently unsolved and unknown engineering problems. And I suspect that if I told you these predictions about Waymo and SpaceX, you’d nod along and say maybe you disagree a bit but you wouldn’t give this high-level argument about unknown unknowns and crossing 90% of the progress.
    - Thane Ruthenis 28 Feb 2025 10:51 UTC
      LW: 26 AF: 10
      6
      AF Parent
      I think that if I consistently applied that argument, I’d end up thinking AGI was probably 5+ years away right up until the day AGI was announced.
      Point 1: That would not necessarily be incorrect; it’s not necessary that you ought to be able to do better than that. Consider math discoveries, which seem to follow a memoryless exponential distribution. Any given time period has a constant probability of a conjecture being proven, so until you observe it happening, it’s always a fixed number of years in the future. I think the position that this is how AGI development ought to be modeled is very much defensible.
      Indeed: if you place AGI in the reference class of self-driving cars/reusable rockets, you implicitly assume that the remaining challenges are engineering challenges, and that the paradigm of LLMs as a whole is sufficient to reach it. Then time-to-AGI could indeed be estimated more or less accurately.
      If we instead assume that some qualitative/theoretical/philosophical insight is still missing, then it becomes a scientific/mathematical challenge instead. The reference class of those is things like Millennium Problems, quantum computing (or, well, it was until recently?), fusion. And as above, memes like “fusion is always X years away” are not necessarily evidence that there’s something wrong with how we do world-modeling.
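      
      The memoryless model from Point 1 can be written out. This is a sketch, assuming the time $T$ until the discovery is exponentially distributed with a constant rate $\lambda$ (the symbols $T$, $\lambda$, $s$ and $t$ are introduced here for illustration and do not appear in the comment itself):
      
      $$
      P(T > s + t \mid T > s) = \frac{e^{-\lambda (s + t)}}{e^{-\lambda s}} = e^{-\lambda t} = P(T > t), \qquad E[T - s \mid T > s] = \frac{1}{\lambda}
      $$
      
      Under that assumption, after $s$ years without a result the expected remaining wait is still $1/\lambda$, whatever $s$ is, which is the sense in which the discovery stays “a fixed number of years in the future”.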
      Point 2: DL is kind of different from other technologies. Here, we’re working against a selection process that’s eager to Goodhart to what we’re requesting, and we’re giving it an enormous amount of resources (compute) to spend on that. It might be successfully fooling us regarding how much progress is actually happening.
      One connection that comes to mind is the “just add epicycles” tragedy:
      Finally, I’m particularly struck by the superficial similarities between the way Ptolemy and Copernicus happened upon a general, overpowered tool for function approximation (Fourier analysis) that enabled them to misleadingly gerrymander false theories around the data, and the way modern ML has been criticized as an inscrutable heap of linear algebra and super-efficient GPUs. I haven’t explored whether these similarities go any deeper, but one implication seems to be that the power and versatility of deep learning might allow suboptimal architectures to perform deceivingly well (just like the power of epicycle-multiplication kept geocentrism alive) and hence distract us from uncovering the actual architectures underlying cognition and intelligence.
      That analogy seems incredibly potent to me.
      Another way to model time-to-AGI given the “deceitful” nature of DL might be to borrow some tools from sociology or economics, e.g. trying to time the market, predict when a social change will happen, or model what’s happening in a hostile epistemic environment. No clear analogy immediately comes to mind, though.
      - Daniel Kokotajlo 28 Feb 2025 18:45 UTC
        LW: 10 AF: 4
        3
        AF Parent
        Re: Point 1: I agree it would not necessarily be incorrect. I do actually think that probably the remaining challenges are engineering challenges. Not necessarily, but probably. Can you point to any challenges that seem (a) necessary for speeding up AI R&D by 5x, and (b) not engineering challenges?
        
        Re: Point 2: I don’t buy it. Deep neural nets are actually useful now, and increasingly so. Making them more useful seems analogous to selective breeding or animal training, not analogous to trying to time the market.
        Thane Ruthenis 3 Mar 2025 9:48 UTC
        LW: 6 AF: 3
        −2
        AF Parent
        Can you point to any challenges that seem (a) necessary for speeding up AI R&D by 5x, and (b) not engineering challenges?
        We’d discussed that some before, but one way to distill it is… I think autonomously doing nontrivial R&D engineering projects requires sustaining coherent agency across a large “inferential distance”. “Time” in the sense of “long-horizon tasks” is a solid proxy for it, but not really the core feature. Instead, it’s about being able to maintain a stable picture of the project even as you move from a fairly simple-in-terms-of-memorized-templates version of that project, to some sprawling, highly specific, real-life mess.
        My sense is that, even now, LLMs are terrible at this^[1] (including Anthropic’s recent coding agent), and that scaling along this dimension has not at all been good. So the straightforward projection of the current trends is not in fact “autonomous R&D agents in <3 years”, and some qualitative advancement is needed to get there.
        Making them more useful seems analogous to selective breeding or animal training
        Are they useful? Yes. Can they be made more useful? For sure. Is the impression that the rate at which they’re getting more useful would result in them 5x’ing AI R&D in <3 years a deceptive impression, the result of us setting up a selection process that would spit out something fooling us into forming this impression? Potentially yes, I argue.
        ^[1] Having looked it up now, METR’s benchmark admits that the environments in which they test are unrealistically “clean”, such that, I imagine, solving the task correctly is the “path of least resistance” in a certain sense (see “systematic differences from the real world” here).
    - Kaj_Sotala 1 Mar 2025 12:33 UTC
      LW: 13 AF: 4
      3
      AF Parent
      I don’t know what your views on self-driving cars are, but if you are like me you look at what Waymo is doing and you think “Yep, it’s working decently well now, and they are scaling up fast, seems plausible that in a few years it’ll be working even better and scaled to every major city. The dream of robotaxis will be a reality, at least in the cities of America.”
      The example of self-driving cars is actually the biggest one that anchors me to timelines of decades or more. A lot of people’s impression after the 2007 DARPA Grand Challenge seemed to be something like “oh, we seem to know how to solve the problem in principle, now we just need a bit more engineering work to make it reliable and agentic in the real world”. Then actually getting things to be as reliable as required for real agents took a lot longer. So past experience would imply that going from “we know in principle how to make something act intelligently and agentically” to “this is actually a reliable real-world agent” can easily take over a decade.
      Another example is that going from the first in-principle demonstration of chain-of-thought to o1 took two years. That’s much shorter than a decade but also a much simpler capability.
      For general AI, I would expect the “we know how to solve things in principle” stage to at least be something like “can solve easy puzzles that a normal human can that the AI hasn’t been explicitly trained on”. Whereas with AI, we’re not even there yet. E.g. I tried giving GPT-4.5, DeepSeek R1, o3-mini, and Claude 3.7 with extended thinking a simple sliding square problem, and they all committed an illegal move at one stage or another.
      And that’s to say nothing about all the other capabilities that a truly general agent, say one capable of running a startup, would need, such as better long-term memory, the ability to formulate its own goals and prioritize between them in domains with no objective rules you could follow to guarantee success, etc. Not only are we lacking convincing in-principle demonstrations of general intelligence within puzzle-like domains, we’re also lacking in-principle demonstrations of these other key abilities.
      - Petropolitan 6 Mar 2025 14:52 UTC
        7 points
        2
        Parent
        Another example is that going from the first in-principle demonstration of chain-of-thought to o1 took two years
        The correct date for the first demonstration of CoT is actually ~July 2020, soon after the GPT-3 release, see the related work review here: https://ar5iv.labs.arxiv.org/html/2102.07350
        Kaj_Sotala 6 Mar 2025 14:55 UTC
        2 points
        0
        Parent
        Thanks!
    - Cole Wyeth 28 Feb 2025 13:19 UTC
      4 points
      −7
      Parent
      I think I agree with Thane’s point 1: because it seems like building intelligence requires a series of conceptual insights, there may be limits to how far in advance I can know it’s about to happen (without like, already knowing how to build it out of math myself). But I don’t view this as a position of total epistemic helplessness—it’s clear that there has been a lot of progress over the last 40 years to the extent that we should be more than halfway there.
      And yeah, I don’t view AGI as equivalent to other technologies; it’s not even clear yet what all the technical problems that need to be solved are! I think it’s more like inventing a tiny mechanical bird than inventing a plane. Birds have probably solved a lot of subproblems that we don’t know exist yet, and I’m really not sure how far we are from building an entire bird.
      - Thane Ruthenis 28 Feb 2025 17:23 UTC
        5 points
        3
        Parent
        But I don’t view this as a position of total epistemic helplessness—it’s clear that there has been a lot of progress over the last 40 years to the extent that we should be more than halfway there.
        Those are not incompatible. Suppose that you vaguely feel that a whole set of independent conceptual insights is missing, and that some of them will only be reachable after some previous ones have been discovered; e.g. you need to go $A \to B \to C$. Then the expected time until the problem is solved is the sum of the expected wait-times $T_{A} + T_{B} + T_{C}$, and if you observe $A$ and $B$ being solved, it shortens to $T_{C}$.
        I think that checks out intuitively. We can very roughly gauge how “mature” a field is, and therefore, how much ground there’s likely to cover.
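        
        Written out as a sketch, assuming each insight arrives after an independent exponentially distributed wait with rate $\lambda_i$ (the rates are an assumption added here for illustration; the comment itself only names the expected wait-times $T_{A}$, $T_{B}$, $T_{C}$):
        
        $$
        E[T_{\text{total}}] = T_{A} + T_{B} + T_{C} = \frac{1}{\lambda_A} + \frac{1}{\lambda_B} + \frac{1}{\lambda_C}, \qquad E[T_{\text{remaining}} \mid A, B \text{ solved}] = T_{C} = \frac{1}{\lambda_C}
        $$
        
        Each individual stage is still memoryless under this assumption, but the total shrinks as stages are observed to complete, which is how “we can't time the next insight” and “we are more than halfway there” can both hold.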
        Cole Wyeth 28 Feb 2025 17:30 UTC
        5 points
        0
        Parent
        Yes, I agree
  - Pekka Puupaa 28 Feb 2025 11:16 UTC
    5 points
    2
    Parent
    It’s been over ~40 years of progress since the perceptron, how do you know we’re in the last ~10% today?
    What would this heuristic have said about the probability of AlphaFold 2 solving protein folding in 2020? What about all the other tasks that had been intractable for decades that became solvable in the past five years?
    To me, 50% over the next 3 years is what sanity looks like.
    - Cole Wyeth 28 Feb 2025 13:25 UTC
      1 point
      2
      Parent
      What would your mindset have had to say about automated science in 2023, human level robots in 2024, AlphaFold curing cancer in 2025?
      - Pekka Puupaa 28 Feb 2025 13:48 UTC
        6 points
        4
        Parent
        My point is that that heuristic is not good. This obviously doesn’t mean that reversing the heuristic would give you good results (reverse stupidity is not intelligence and so on). What one needs is a different set of heuristics.
        If you extrapolate capability graphs in the most straightforward way, you get the result that AGI should arrive around 2027-2028. Scenario analyses (like the ones produced by Kokotajlo and Aschenbrenner) tend to converge on the same result.
        An effective cancer cure will likely require superintelligence, so I would be expecting one around 2029 assuming alignment gets solved.
        We mostly solved egg frying and laundry folding last year with Aloha and Optimus, which were some of the most long-standing issues in robotics. So human level robots in 2024 would actually have been an okay prediction. Actual human level probably requires human level intelligence, so 2027.
        Cole Wyeth 28 Feb 2025 14:53 UTC
        2 points
        0
        Parent
        I’m not actually relying on a heuristic, I’m compressing https://www.lesswrong.com/posts/vvgND6aLjuDR6QzDF/my-model-of-what-is-going-on-with-llms
        If you extrapolate capability graphs in the most straightforward way, you get the result that AGI should arrive around 2027-2028. Scenario analyses (like the ones produced by Kokotajlo and Aschenbrenner) tend to converge on the same result.
        If you extrapolate log GDP growth or the value of the S&P 500, superintelligence would not be anticipated any time soon. If you extrapolate the number of open mathematical theorems proved by LLMs you get ~a constant at 0. You have to decide which straight line you expect to stay straight; what Aschenbrenner did is not objective, and I don’t know about Kokotajlo but I doubt it was meaningfully independent.
        We mostly solved egg frying and laundry folding last year with Aloha and Optimus, which were some of the most long-standing issues in robotics. So human level robots in 2024 would actually have been an okay prediction. Actual human level probably requires human level intelligence, so 2027.
        Interesting, link?
        This reasoning feels a little motivated though—I think it would be obvious if we had human(-laborer)-level robots because they’d be walking around doing stuff. I’ve worked in robotics research a little bit and I can tell you that setting up a demo for an isolated task is VERY different from selling a product that can do it, let alone one product that can seamlessly transition between many tasks.
        Pekka Puupaa 28 Feb 2025 17:20 UTC
        4 points
        3
        Parent
        I’m not actually relying on a heuristic, I’m compressing https://www.lesswrong.com/posts/vvgND6aLjuDR6QzDF/my-model-of-what-is-going-on-with-llms
        Very interesting, thanks! On a quick skim, I don’t think I agree with the claim that LLMs have never done anything important. I know for a fact that they have written a lot of production code for a lot of companies, for example. And I personally have read AI texts funny or entertaining enough to reflect back on, and AI art beautiful enough to admire even a year later. (All of this is highly subjective, of course. I don’t think you’d find the same examples impressive.) If you don’t think any of that qualifies as important, then I think your definition of important may be overly narrow.
        But I’ll have to look at this more deeply later.
        If you extrapolate log GDP growth or the value of the S&P 500, superintelligence would not be anticipated any time soon. If you extrapolate the number of open mathematical theorems proved by LLMs you get ~a constant at 0. You have to decide which straight line you expect to stay straight; what Aschenbrenner did is not objective, and I don’t know about Kokotajlo but I doubt it was meaningfully independent.
        I think this reasoning would also lead one to reject Moore’s law as a valid way to forecast future compute prices. It is in some sense “obvious” what straight lines one should be looking at: smooth lines of technological progress. I claim that just about any capability you pick with a sufficiently “smooth”, “continuous” definition (e.g. your example of the number of open mathematical theorems solved would have to be amended to allow for partial progress and partial solutions) will tend to converge around 2027-28. Some converge earlier, some later, but that seems to be around the consensus for when we can expect human-level capability for nearly all tasks anybody’s bothered to model.
        Interesting, link?
        The Mobile Aloha website: https://mobile-aloha.github.io/
        The front page has a video of the system autonomously cooking a shrimp and other examples. It is still quite slow and clumsy, but being able to complete tasks like this at all is already light years ahead of where we were just a few years ago.
        I’ve worked in robotics research a little bit and I can tell you that setting up a demo for an isolated task is VERY different from selling a product that can do it, let alone one product that can seamlessly transition between many tasks.
        Oh, I know. It’s normally 5-20 years from lab to home. My 2027 prediction is for a research robot being able to do anything a human can do in an ordinary environment, not necessarily a mass-producible, inexpensive product for consumers or even most businesses. But obviously the advent of superintelligence, under my model, is going to accelerate those usual 5-20 year timelines quite a bit, so it can’t be much after 2027 that you’ll be able to buy your own android. Assuming “buying things” is still a thing, assuming the world remains recognizable for at least some years, and so on.
        Cole Wyeth 28 Feb 2025 17:29 UTC
        2 points
        0
        Parent
        Oh, I know. It’s normally 5-20 years from lab to home. My 2027 prediction is for a research robot being able to do anything a human can do in an ordinary environment, not necessarily a mass-producible, inexpensive product for consumers or even most businesses. But obviously the advent of superintelligence, under my model, is going to accelerate those usual 5-20 year timelines quite a bit, so it can’t be much after 2027 that you’ll be able to buy your own android. Assuming “buying things” is still a thing, assuming the world remains recognizable for at least some years, and so on.
        Okay, at this point perhaps we can just put some (fake) money on the line. Here are some example markets where we can provide each other liquidity, please feel free to suggest others:
- Nathan Helm-Burger 28 Feb 2025 11:15 UTC
  3 points
  1
  Parent
  Mine is still early 2027. My timeline is unchanged by the weak showing from GPT-4.5, because my timelines were already assuming that scaling would plateau. I was also already taking RL post-training and reasoning into account. This is what I was pointing at with my Manifold Markets about post-training fine-tuning plus scaffolding resulting in a substantial capability jump. My expectation of short timelines is that just something of approximately the current capability of existing SotA models (plus reasoning and research and scaffolds and agentic iterative refinement of hypotheses including critiquing sources) is sufficient for speedup of research into novel breakthroughs. I expect these novel breakthroughs to lead to AGI. See my discussion elsewhere of my ‘innovation overhang’ hypothesis. Also, see the discussion in the comments section of my post A Path to Human Autonomy