It’s wild to me that you’ve concentrated a full 50% of your measure in the next <3 years. What if there are some aspects of intelligence which we don’t know we don’t know about yet? It’s been over ~40 years of progress since the perceptron, how do you know we’re in the last ~10% today?
Progress over the last 40 years has been not at all linear. I don’t think this “last 10%” thing is the right way to think about it.
The argument you make is tempting; I must admit I feel the pull of it. But I think it proves too much. I think that you will still be able to make that argument when AGI is, in fact, 3 years away. In fact, you’ll still be able to make that argument when AGI is 3 months away. I think that if I consistently applied that argument, I’d end up thinking AGI was probably 5+ years away right up until the day AGI was announced.
Here’s another point. I think you are treating AGI as a special case. You wouldn’t apply this argument—this level of skepticism—to mundane technologies. For example, take self-driving cars. I don’t know what your views on self-driving cars are, but if you are like me you look at what Waymo is doing and you think “Yep, it’s working decently well now, and they are scaling up fast, seems plausible that in a few years it’ll be working even better and scaled to every major city. The dream of robotaxis will be a reality, at least in the cities of America.” Or consider SpaceX Starship. I’ve been following its development since, like, 2016, and it seems to me that it really will (probably but not definitely) be fully reusable in four years, even though this will require solving currently unsolved and unknown engineering problems. And I suspect that if I told you these predictions about Waymo and SpaceX, you’d nod along, maybe disagreeing a bit, but you wouldn’t reach for this high-level argument about unknown unknowns and whether we’ve crossed 90% of the progress.
I think that if I consistently applied that argument, I’d end up thinking AGI was probably 5+ years away right up until the day AGI was announced.
Point 1: That would not necessarily be incorrect; it’s not a given that you ought to be able to do better than that. Consider math discoveries, which seem to follow a memoryless exponential distribution. Any given time period has a constant probability of a conjecture being proven, so until you observe it happening, it’s always a fixed number of years in the future. I think the position that this is how AGI development ought to be modeled is very much defensible.
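To spell the model out (a minimal sketch, assuming the memoryless-exponential framing above): if the time-to-proof $T$ is exponential with rate $\lambda$, then having already waited $s$ years tells you nothing, and the expected remaining wait never shrinks:

$$P(T > s + t \mid T > s) = \frac{e^{-\lambda (s+t)}}{e^{-\lambda s}} = e^{-\lambda t} = P(T > t), \qquad \mathbb{E}[T - s \mid T > s] = \frac{1}{\lambda}.$$

Under this model, “it still looks about $1/\lambda$ years away” is what a well-calibrated forecaster keeps saying right up until the day it happens, and that is not a reasoning error.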
Indeed: if you place AGI in the reference class of self-driving cars/reusable rockets, you implicitly assume that the remaining challenges are engineering challenges, and that the paradigm of LLMs as a whole is sufficient to reach it. Then time-to-AGI could indeed be estimated more or less accurately.
If we instead assume that some qualitative/theoretical/philosophical insight is still missing, then it becomes a scientific/mathematical challenge instead. The reference class for those is things like the Millennium Problems, quantum computing (or, well, it was until recently?), and fusion. And as above, memes like “fusion is always X years away” are not necessarily evidence that there’s something wrong with how we do world-modeling.
Point 2: DL is kind of different from other technologies. Here, we’re working against a selection process that’s eager to Goodhart on whatever we’re requesting, and we’re giving it an enormous amount of resources (compute) to spend on that. It might be successfully fooling us regarding how much progress is actually happening.

One connection that comes to mind is the “just add epicycles” tragedy:
Finally, I’m particularly struck by the superficial similarities between the way Ptolemy and Copernicus happened upon a general, overpowered tool for function approximation (Fourier analysis) that enabled them to misleadingly gerrymander false theories around the data, and the way modern ML has been criticized as an inscrutable heap of linear algebra and super-efficient GPUs. I haven’t explored whether these similarities go any deeper, but one implication seems to be that the power and versatility of deep learning might allow suboptimal architectures to perform deceivingly well (just like the power of epicycle-multiplication kept geocentrism alive) and hence distract us from uncovering the actual architectures underlying cognition and intelligence.
That analogy seems incredibly potent to me.
Another way to model time-to-AGI given the “deceitful” nature of DL might be to borrow some tools from sociology or economics, e.g. trying to time the market, predict when a social change will happen, or model what’s happening in a hostile epistemic environment. No clear analogy immediately comes to mind, though.
Re: Point 1: I agree it would not necessarily be incorrect. I do actually think that probably the remaining challenges are engineering challenges. Not necessarily, but probably. Can you point to any challenges that seem (a) necessary for speeding up AI R&D by 5x, and (b) not engineering challenges?
Re: Point 2: I don’t buy it. Deep neural nets are actually useful now, and increasingly so. Making them more useful seems analogous to selective breeding or animal training, not analogous to trying to time the market.
Can you point to any challenges that seem (a) necessary for speeding up AI R&D by 5x, and (b) not engineering challenges?
We’d discussed that some before, but one way to distill it is… I think autonomously doing nontrivial R&D engineering projects requires sustaining coherent agency across a large “inferential distance”. “Time” in the sense of “long-horizon tasks” is a solid proxy for it, but not really the core feature. Instead, it’s about being able to maintain a stable picture of the project even as you move from a fairly simple-in-terms-of-memorized-templates version of that project, to some sprawling, highly specific, real-life mess.
My sense is that, even now, LLMs are terrible at this[1] (including Anthropic’s recent coding agent), and that scaling along this dimension has not at all been good. So the straightforward projection of the current trends is not in fact “autonomous R&D agents in <3 years”, and some qualitative advancement is needed to get there.
Making them more useful seems analogous to selective breeding or animal training
Are they useful? Yes. Can they be made more useful? For sure. Is the impression that the rate at which they’re getting more useful would result in them 5x’ing AI R&D in <3 years a deceptive impression, the result of us setting up a selection process that would spit out something fooling us into forming this impression? Potentially yes, I argue.
Having looked it up now, METR’s benchmark admits that the environments in which they test are unrealistically “clean”, such that, I imagine, solving the task correctly is the “path of least resistance” in a certain sense (see “systematic differences from the real world” here).
I don’t know what your views on self-driving cars are, but if you are like me you look at what Waymo is doing and you think “Yep, it’s working decently well now, and they are scaling up fast, seems plausible that in a few years it’ll be working even better and scaled to every major city. The dream of robotaxis will be a reality, at least in the cities of America.”
The example of self-driving cars is actually the biggest one that anchors me to timelines of decades or more. A lot of people’s impression after the 2007 DARPA Urban Challenge seemed to be something like “oh, we seem to know how to solve the problem in principle, now we just need a bit more engineering work to make it reliable and agentic in the real world”. Then actually getting things to be as reliable as required for real agents took a lot longer. So past experience would imply that going from “we know in principle how to make something act intelligently and agentically” to “this is actually a reliable real-world agent” can easily take over a decade.

Another example is that going from the first in-principle demonstration of chain-of-thought to o1 took two years. That’s much shorter than a decade but also a much simpler capability.
For general AI, I would expect the “we know how to solve things in principle” stage to at least be something like “can solve easy puzzles that a normal human can solve but that the AI hasn’t been explicitly trained on”. Whereas with current AI, we’re not even there yet. E.g., I tried giving GPT-4.5, DeepSeek R1, o3-mini, and Claude 3.7 with extended thinking a simple sliding square problem, and they all committed an illegal move at one stage or another.
And that’s to say nothing about all the other capabilities that a truly general agent—say one capable of running a startup—would need, like better long-term memory, the ability to formulate its own goals and prioritize between them in domains with no objective rules you could follow to guarantee success, etc. Not only are we lacking convincing in-principle demonstrations of general intelligence within puzzle-like domains, we’re also lacking in-principle demonstrations of these other key abilities.

The correct date for the first demonstration of CoT is actually ~July 2020, soon after the GPT-3 release; see the related-work review here: https://ar5iv.labs.arxiv.org/html/2102.07350

Thanks!
I think I agree with Thane’s point 1: because it seems like building intelligence requires a series of conceptual insights, there may be limits to how far in advance I can know it’s about to happen (without like, already knowing how to build it out of math myself). But I don’t view this as a position of total epistemic helplessness—it’s clear that there has been a lot of progress over the last 40 years to the extent that we should be more than halfway there.
And yeah, I don’t view AGI as equivalent to other technologies—it’s not even clear yet what all the technical problems that need to be solved are! I think it’s more like inventing a tiny mechanical bird than inventing a plane. Birds have probably solved a lot of subproblems that we don’t know exist yet, and I’m really not sure how far we are from building an entire bird.
But I don’t view this as a position of total epistemic helplessness—it’s clear that there has been a lot of progress over the last 40 years to the extent that we should be more than halfway there.
Those are not incompatible. Suppose that you vaguely feel that a whole set of independent conceptual insights is missing, and that some of them will only be reachable after some previous ones have been discovered; e.g. you need to go A→B→C. Then the expected time until the problem is solved is the sum of the expected wait-times T_A + T_B + T_C, and if you observe A and B being solved, it shortens to T_C.
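In symbols, as a toy model, treating the stages as independent and assuming C’s clock only starts once B is solved:

$$\mathbb{E}[T_{\text{total}}] = \mathbb{E}[T_A] + \mathbb{E}[T_B] + \mathbb{E}[T_C],$$

and the moment you observe B being solved, the expected remaining time drops from that sum to just $\mathbb{E}[T_C]$. So observing partial conceptual progress genuinely shortens the forecast, even if each individual stage is memoryless.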
I think that checks out intuitively. We can very roughly gauge how “mature” a field is, and therefore how much ground is likely left to cover.

Yes, I agree
It’s been over ~40 years of progress since the perceptron, how do you know we’re in the last ~10% today?
What would this heuristic have said about the probability of AlphaFold 2 solving protein folding in 2020? What about all the other tasks that had been intractable for decades but became solvable in the past five years?
To me, 50% over the next 3 years is what sanity looks like.

What would your mindset have had to say about automated science in 2023, human level robots in 2024, AlphaFold curing cancer in 2025?
My point is that that heuristic is not good. This obviously doesn’t mean that reversing the heuristic would give you good results (reverse stupidity is not intelligence and so on). What one needs is a different set of heuristics.
If you extrapolate capability graphs in the most straightforward way, you get the result that AGI should arrive around 2027-2028. Scenario analyses (like the ones produced by Kokotajlo and Aschenbrenner) tend to converge on the same result.
An effective cancer cure will likely require superintelligence, so I would be expecting one around 2029 assuming alignment gets solved.
We mostly solved egg frying and laundry folding last year with Aloha and Optimus, which were some of the most long-standing issues in robotics. So human level robots in 2024 would actually have been an okay prediction. Actual human level probably requires human level intelligence, so 2027.

I’m not actually relying on a heuristic, I’m compressing https://www.lesswrong.com/posts/vvgND6aLjuDR6QzDF/my-model-of-what-is-going-on-with-llms
If you extrapolate capability graphs in the most straightforward way, you get the result that AGI should arrive around 2027-2028. Scenario analyses (like the ones produced by Kokotajlo and Aschenbrenner) tend to converge on the same result.
If you extrapolate log GDP growth or the value of the S&P 500, superintelligence would not be anticipated any time soon. If you extrapolate the number of open mathematical theorems proved by LLMs, you get ~a constant at 0. You have to decide which straight line you expect to stay straight—what Aschenbrenner did is not objective, and I don’t know about Kokotajlo but I doubt it was meaningfully independent.
We mostly solved egg frying and laundry folding last year with Aloha and Optimus, which were some of the most long-standing issues in robotics. So human level robots in 2024 would actually have been an okay prediction. Actual human level probably requires human level intelligence, so 2027.
Interesting, link?
This reasoning feels a little motivated though—I think it would be obvious if we had human(-laborer)-level robots because they’d be walking around doing stuff. I’ve worked in robotics research a little bit and I can tell you that setting up a demo for an isolated task is VERY different from selling a product that can do it, let alone one product that can seamlessly transition between many tasks.
Very interesting, thanks! On a quick skim, I don’t think I agree with the claim that LLMs have never done anything important. I know for a fact that they have written a lot of production code for a lot of companies, for example. And I personally have read AI texts funny or entertaining enough to reflect back on, and AI art beautiful enough to admire even a year later. (All of this is highly subjective, of course. I don’t think you’d find the same examples impressive.) If you don’t think any of that qualifies as important, then I think your definition of important may be too strict.
But I’ll have to look at this more deeply later.
If you extrapolate log GDP growth or the value of the S&P 500, superintelligence would not be anticipated any time soon. If you extrapolate the number of open mathematical theorems proved by LLMs, you get ~a constant at 0. You have to decide which straight line you expect to stay straight—what Aschenbrenner did is not objective, and I don’t know about Kokotajlo but I doubt it was meaningfully independent.
I think this reasoning would also lead one to reject Moore’s law as a valid way to forecast future compute prices. It is in some sense “obvious” what straight lines one should be looking at: smooth lines of technological progress. I claim that just about any capability with a sufficiently “smooth”, “continuous” definition (i.e. your example of the number of open mathematical theorems solved would have to be amended to allow for partial progress and partial solutions) will tend to converge around 2027-28. Some converge earlier, some later, but that seems to be around the consensus for when we can expect human-level capability for nearly all tasks anybody’s bothered to model.
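To make the kind of extrapolation I mean concrete, here is a toy sketch; the numbers are made up purely for illustration and are not anyone’s actual benchmark data. Fit a straight line to the log of a smoothly defined capability metric and read off when the trend crosses a chosen threshold.

```python
import numpy as np

# Made-up illustrative values for a smoothly defined capability metric,
# e.g. "length of task (in hours) completed at 50% reliability".
years  = np.array([2020, 2021, 2022, 2023, 2024])
metric = np.array([0.02, 0.05, 0.14, 0.40, 1.10])  # hypothetical, not real data

# Assume exponential growth: fit a straight line in log space.
slope, intercept = np.polyfit(years, np.log(metric), 1)

# When does the extrapolated trend cross a (hypothetical) human-level threshold?
threshold = 160.0  # ~one working month of task length, in hours
year_crossed = (np.log(threshold) - intercept) / slope

print(f"doubling time: {np.log(2) / slope:.2f} years")
print(f"trend crosses {threshold:.0f} h around {year_crossed:.1f}")
```

The specific output depends entirely on the inputs; the point is just that once you pick a metric with a smooth, continuous definition, the extrapolation itself is mechanical.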
The Mobile Aloha website: https://mobile-aloha.github.io/

The front page has a video of the system autonomously cooking a shrimp and other examples. It is still quite slow and clumsy, but being able to complete tasks like this at all is already light years ahead of where we were just a few years ago.
I’ve worked in robotics research a little bit and I can tell you that setting up a demo for an isolated task is VERY different from selling a product that can do it, let alone one product that can seamlessly transition between many tasks.
Oh, I know. It’s normally 5-20 years from lab to home. My 2027 prediction is for a research robot being able to do anything a human can do in an ordinary environment, not necessarily a mass-producible, inexpensive product for consumers or even most businesses. But obviously the advent of superintelligence, under my model, is going to accelerate those usual 5-20 year timelines quite a bit, so it can’t be much after 2027 that you’ll be able to buy your own android. Assuming “buying things” is still a thing, assuming the world remains recognizable for at least some years, and so on.
Okay, at this point perhaps we can just put some (fake) money on the line. Here are some example markets where we can provide each other liquidity; please feel free to suggest others: