Progress over the last 40 years has not been at all linear. I don’t think this “last 10%” thing is the right way to think about it.
The argument you make is tempting; I must admit I feel the pull of it. But I think it proves too much. I think that you will still be able to make that argument when AGI is, in fact, 3 years away. In fact, you’ll still be able to make that argument when AGI is 3 months away. I think that if I consistently applied that argument, I’d end up thinking AGI was probably 5+ years away right up until the day AGI was announced.
Here’s another point. I think you are treating AGI as a special case. You wouldn’t apply this argument—this level of skepticism—to mundane technologies. For example, take self-driving cars. I don’t know what your views on self-driving cars are, but if you are like me you look at what Waymo is doing and you think “Yep, it’s working decently well now, and they are scaling up fast, seems plausible that in a few years it’ll be working even better and scaled to every major city. The dream of robotaxis will be a reality, at least in the cities of America.” Or consider SpaceX Starship. I’ve been following its development since, like, 2016, and it seems to me that it really will (probably but not definitely) be fully reusable in four years, even though this will require solving currently unsolved and unknown engineering problems. And I suspect that if I told you these predictions about Waymo and SpaceX, you’d nod along and say maybe you disagree a bit but you wouldn’t give this high-level argument about unknown unknowns and crossing 90% of the progress.
I think that if I consistently applied that argument, I’d end up thinking AGI was probably 5+ years away right up until the day AGI was announced.
Point 1: That would not necessarily be incorrect; it’s not a given that you ought to be able to do better than that. Consider math discoveries, which seem to follow a memoryless exponential distribution. Any given time period has a constant probability of a conjecture being proven, so until you observe it happening, it’s always a fixed number of years in the future. I think the position that this is how AGI development ought to be modeled is very much defensible.
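(A quick numerical sketch of that memorylessness claim, not from the original exchange: assuming time-to-proof is exponentially distributed with a hypothetical 10-year mean, the expected remaining wait is the same no matter how long you’ve already waited.)

```python
# Hypothetical illustration: exponential waiting times are memoryless,
# so "it's ~10 years away" never updates just because time has passed
# without the discovery happening.
import numpy as np

rng = np.random.default_rng(0)
mean_years = 10.0  # assumed mean time-to-discovery (made-up number)
samples = rng.exponential(mean_years, size=1_000_000)

for elapsed in (0, 5, 20):
    remaining = samples[samples > elapsed] - elapsed
    print(f"waited {elapsed:>2} yrs -> expected remaining wait ~ {remaining.mean():.1f} yrs")
# Each line prints roughly 10.0 years.
```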
Indeed: if you place AGI in the reference class of self-driving cars/reusable rockets, you implicitly assume that the remaining challenges are engineering challenges, and that the paradigm of LLMs as a whole is sufficient to reach it. Then time-to-AGI could indeed be estimated more or less accurately.
If we instead assume that some qualitative/theoretical/philosophical insight is still missing, then it becomes a scientific/mathematical challenge. The reference class for those is things like the Millennium Problems, quantum computing (or, well, it was until recently?), and fusion. And as above, memes like “fusion is always X years away” are not necessarily evidence that there’s something wrong with how we do world-modeling.
Point 2: DL is kind of different from other technologies. Here, we’re working against a selection process that’s eager to Goodhart to what we’re requesting, and we’re giving it an enormous amount of resources (compute) to spend on that. It might be successfully fooling us regarding how much progress is actually happening.
One connection that comes to mind is the “just add epicycles” tragedy:
Finally, I’m particularly struck by the superficial similarities between the way Ptolemy and Copernicus happened upon a general, overpowered tool for function approximation (Fourier analysis) that enabled them to misleadingly gerrymander false theories around the data, and the way modern ML has been criticized as an inscrutable heap of linear algebra and super-efficient GPUs. I haven’t explored whether these similarities go any deeper, but one implication seems to be that the power and versatility of deep learning might allow suboptimal architectures to perform deceivingly well (just like the power of epicycle-multiplication kept geocentrism alive) and hence distract us from uncovering the actual architectures underlying cognition and intelligence.
That analogy seems incredibly potent to me.
Another way to model time-to-AGI given the “deceitful” nature of DL might be to borrow some tools from sociology or economics, e.g. trying to time the market, predict when a social change will happen, or model what’s happening in a hostile epistemic environment. No clear analogy immediately comes to mind, though.
Re: Point 1: I agree it would not necessarily be incorrect. I do actually think that probably the remaining challenges are engineering challenges. Not necessarily, but probably. Can you point to any challenges that seem (a) necessary for speeding up AI R&D by 5x, and (b) not engineering challenges?
Re: Point 2: I don’t buy it. Deep neural nets are actually useful now, and increasingly so. Making them more useful seems analogous to selective breeding or animal training, not analogous to trying to time the market.
Can you point to any challenges that seem (a) necessary for speeding up AI R&D by 5x, and (b) not engineering challenges?
We’d discussed that some before, but one way to distill it is… I think autonomously doing nontrivial R&D engineering projects requires sustaining coherent agency across a large “inferential distance”. “Time” in the sense of “long-horizon tasks” is a solid proxy for it, but not really the core feature. Instead, it’s about being able to maintain a stable picture of the project even as you move from a fairly simple-in-terms-of-memorized-templates version of that project, to some sprawling, highly specific, real-life mess.
My sense is that, even now, LLMs are terrible at this[1] (including Anthropic’s recent coding agent), and that scaling along this dimension has not been good at all. So the straightforward projection of the current trends is not in fact “autonomous R&D agents in <3 years”, and some qualitative advancement is needed to get there.
Making them more useful seems analogous to selective breeding or animal training
Are they useful? Yes. Can they be made more useful? For sure. But is the impression that they’re getting more useful at a rate that would have them 5x’ing AI R&D in <3 years a deceptive one, the result of us setting up a selection process that would spit out something that fools us into forming this impression? Potentially yes, I argue.
Having looked it up now: METR admits that the environments they test in are unrealistically “clean”, such that, I imagine, solving the task correctly is the “path of least resistance” in a certain sense (see “systematic differences from the real world” here).
I don’t know what your views on self-driving cars are, but if you are like me you look at what Waymo is doing and you think “Yep, it’s working decently well now, and they are scaling up fast, seems plausible that in a few years it’ll be working even better and scaled to every major city. The dream of robotaxis will be a reality, at least in the cities of America.”
The example of self-driving cars is actually the biggest one that anchors me to timelines of decades or more. A lot of people’s impression after the 2007 DARPA Grand Challenge seemed to be something like “oh, we seem to know how to solve the problem in principle, now we just need a bit more engineering work to make it reliable and agentic in the real world”. Then actually getting things to be as reliable as required for real agents took a lot longer. So past experience would imply that going from “we know in principle how to make something act intelligently and agentically” to “this is actually a reliable real-world agent” can easily take over a decade.
Another example is that going from the first in-principle demonstration of chain-of-thought to o1 took two years. That’s much shorter than a decade but also a much simpler capability.
For general AI, I would expect the “we know how to solve things in principle” stage to at least be something like “can solve easy puzzles that a normal human can solve but that the AI hasn’t been explicitly trained on”. And we’re not even there yet. E.g. I tried giving GPT-4.5, DeepSeek R1, o3-mini, and Claude 3.7 with extended thinking a simple sliding square puzzle, and they all committed an illegal move at one stage or another.
And that’s to say nothing about all the other capabilities that a truly general agent—say one capable of running a startup—would need, like better long-term memory, the ability to formulate its own goals and prioritize between them in domains with no objective rules you could follow to guarantee success, etc. Not only are we lacking convincing in-principle demonstrations of general intelligence within puzzle-like domains, we’re also lacking in-principle demonstrations of these other key abilities.
The correct date for the first demonstration of CoT is actually ~July 2020, soon after the GPT-3 release; see the related work review here: https://ar5iv.labs.arxiv.org/html/2102.07350
Thanks!
I think I agree with Thane’s point 1: because it seems like building intelligence requires a series of conceptual insights, there may be limits to how far in advance I can know it’s about to happen (without, like, already knowing how to build it out of math myself). But I don’t view this as a position of total epistemic helplessness—it’s clear that there has been a lot of progress over the last 40 years to the extent that we should be more than halfway there.
And yeah, I don’t view AGI as equivalent to other technologies—it’s not even clear yet what all the technical problems that need to be solved are! I think it’s more like inventing a tiny mechanical bird than inventing a plane. Birds have probably solved a lot of subproblems that we don’t know exist yet, and I’m really not sure how far we are from building an entire bird.
But I don’t view this as a position of total epistemic helplessness—it’s clear that there has been a lot of progress over the last 40 years to the extent that we should be more than halfway there.
Those are not incompatible. Suppose that you vaguely feel that a whole set of independent conceptual insights is missing, and that some of them will only be reachable after some previous ones have been discovered; e.g. you need to go A→B→C. Then the expected time until the problem is solved is the sum of the expected wait-times T_A + T_B + T_C, and if you observe A and B being solved, it shortens to T_C.
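(A minimal numerical sketch of that A→B→C point, with made-up per-insight means; it just illustrates that the forecast is the sum of the stages’ expected wait-times and collapses to the last stage once the earlier insights are observed.)

```python
# Hypothetical illustration of sequential insights A -> B -> C:
# total expected time is the sum of the stages' expected wait-times,
# and observing A and B solved leaves only the expected wait for C.
import numpy as np

rng = np.random.default_rng(1)
mean_wait = {"A": 15.0, "B": 10.0, "C": 5.0}  # assumed means in years (made-up)
n = 1_000_000
waits = {k: rng.exponential(m, size=n) for k, m in mean_wait.items()}

total = waits["A"] + waits["B"] + waits["C"]
print(f"expected total time at the start:         ~{total.mean():.1f} yrs")   # ~30
print(f"expected remaining time once A, B solved: ~{waits['C'].mean():.1f} yrs")  # ~5
```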
I think that checks out intuitively. We can very roughly gauge how “mature” a field is, and therefore how much ground is likely left to cover.
Yes, I agree