Indeed I do think of it that way.
FWIW, I continue to think your models here are obviously not building on Cotra’s thing, and think something pretty weird is going on when you say they do. Which is not like catastrophic, but I think the credit allocation here feels quite weird.
Is your take “Use these different parameters and you get AGI in 2028 with the current methods”?
At the time, iirc, I went through Ajeya’s spreadsheet and thought about each parameter and twiddled them to be more correct-according-to-me, and got something like median 2030 at the end.
Insofar as Kokotajlo’s recollection after the ChatGPT moment can be trusted, he thought there was a 50% chance of reaching AGI by 2030.
If you can get that or 2050 equally well off yelling “Biological Anchoring”, why not admit that the intuition comes first and then you hunt around for parameters you like? This doesn’t sound like good methodology to me.
I don’t think the intuition came first? I think it was playing around with the model that caused my intuitions to shift, not the other way around. Hard to attribute exactly ofc.
Anyhow, I certainly don’t deny that there’s a big general tendency for people to fit models to their intuitions. I think you are falsely implying that I do deny that. I don’t know if I’ve loudly stated it publicly before but I definitely am aware of that and have been for years, and I’m embracing it in fact—the model is a helpful tool for articulating and refining and yes sometimes changing my intuitions, but the intuitions still play a central role. I’ll try to state that more loudly in future releases.
One can apply the same methodological argument to a different problem and see whether it holds up. The number of civilisations outside the Solar System is supposed to be estimable via the Drake equation. Drake’s original estimates implied that the Milky Way contains between 1K and 100M civilisations. The only ground truth we have is that we have yet to find any reliable evidence of such civilisations. Yet I don’t see where the equation itself is erroneous; the problem lies in the parameters fed into it.
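To make the analogy concrete, here is a minimal sketch of the Drake equation in code. The parameter ranges are illustrative stand-ins for the kind of values discussed around 1961, not a reconstruction of Drake’s actual worksheet:

```python
# Drake equation: N = R* x fp x ne x fl x fi x fc x L
# The ranges below are illustrative stand-ins for the 1961-era discussion;
# Drake's own headline range for N was roughly 1e3 to 1e8.

low  = dict(R=1,  fp=0.2, ne=1, fl=1, fi=1, fc=0.1, L=1e3)   # pessimistic end
high = dict(R=10, fp=0.5, ne=5, fl=1, fi=1, fc=0.2, L=1e8)   # optimistic end

def drake(p):
    """Expected number of detectable civilisations in the Milky Way."""
    return p["R"] * p["fp"] * p["ne"] * p["fl"] * p["fi"] * p["fc"] * p["L"]

print(f"N spans ~{drake(low):.0f} to ~{drake(high):.1e}")  # ~20 to ~5e8
# The equation itself is just bookkeeping: the many orders of magnitude of
# spread in N come entirely from the uncertainty in the inputs.
```

The multiplication is trivially correct; the enormous spread in the output comes entirely from how uncertain the inputs are, which is the same situation I am describing with Cotra’s model.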
Returning to the AI timeline crux, Cotra’s model was the following. TAI is created once someone spends enough compute on a single training run. The required amount is compute_required(t) = compute_under_2020_knowledge / knowledge_factor(t), while compute_affordable(t) grows exponentially until it hits bottlenecks from the world economy. Estimating compute_affordable only requires keeping track of who produces compute, how it is produced and who is willing to pay for it; a similar procedure was used in the AI-2027 compute forecast.
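Here is a minimal sketch of that structure, with placeholder numbers rather than Cotra’s medians (her report also uses full probability distributions and caps on algorithmic progress, which I ignore here):

```python
# Toy point-estimate version of the Bio Anchors timeline logic.
# All constants below are placeholders for illustration, not Cotra's medians.
C_2020 = 1e34             # FLOP needed for TAI at 2020 algorithmic knowledge (placeholder)
HALVING_YEARS = 2.5       # knowledge_factor doubles every HALVING_YEARS (placeholder)
A_2020 = 1e24             # FLOP affordable for one training run in 2020 (placeholder)
SPEND_GROWTH = 10 ** 0.5  # affordable compute grows ~0.5 OOM per year (placeholder)

def compute_required(year):
    """Compute needed for TAI, shrinking as algorithmic knowledge improves."""
    return C_2020 / 2 ** ((year - 2020) / HALVING_YEARS)

def compute_affordable(year):
    """Compute the biggest actor can spend on one run, growing with the economy."""
    return A_2020 * SPEND_GROWTH ** (year - 2020)

# TAI arrives in the first year affordable compute covers the requirement.
year = 2020
while compute_affordable(year) < compute_required(year):
    year += 1
print("TAI year under these toy parameters:", year)  # 2037 with the numbers above
```

Once the structure is fixed, the output date is almost entirely a function of the parameter choices, which is exactly where the disagreement lives.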
Then Cotra proceeded to wildly misestimate the inputs. Her assumption about the knowledge factor was that it makes TAI twice as easy to create every 2-3 years, which I find dubious for reasons described in the collapsed sections below. Her estimates of compute_under_2020_knowledge are total BS for reasons I detailed in another comment. So I fail to see where Cotra was mistaken, aside from using parameters that are total BS. And if her model was correct apart from the BSed parameters, then correcting the parameters would be the natural move.
Cotra’s rationalisation for TAI becoming twice as easy to create every few years
I consider two types of algorithmic progress: relatively incremental and steady progress from iteratively improving architectures and learning algorithms, and the chance of “breakthrough” progress which brings the technical difficulty of training a transformative model down from “astronomically large” / “impossible” to “broadly feasible.”
For incremental progress, the main source I used was Hernandez and Brown 2020, “Measuring the Algorithmic Efficiency of Neural Networks.” The authors reimplemented open source state-of-the-art (SOTA) ImageNet models between 2012 and 2019 (six models in total). They trained each model up to the point that it achieved the same performance as AlexNet achieved in 2012, and recorded the total FLOP that this required. They found that the SOTA model in 2019, EfficientNet B0, required ~44 times fewer training FLOP to achieve AlexNet performance than AlexNet did; the six data points fit a power law curve with the amount of computation required to match AlexNet halving every ~16 months over the seven years in the dataset. They also show that linear programming displayed a similar trend over a longer period of time: when hardware is held fixed, the time in seconds taken to solve a standard basket of mixed integer programs by SOTA commercial software packages halved every ~13 months over the 21 years from 1996 to 2017.
Grace 2013 (“Algorithmic Progress in Six Domains”) is the only other paper attempting to systematically quantify algorithmic progress that I am currently aware of, although I have not done a systematic literature review and may be missing others. I have chosen not to examine it in detail because a) it was written largely before the deep learning boom and mostly does not focus on ML tasks, and b) it is less straightforward to translate Grace’s results into the format that I am most interested in (“How has the amount of computation required to solve a fixed task decreased over time?”). Paul is familiar with the results, and he believes that algorithmic progress across the six domains studied in Grace 2013 is consistent with a similar but slightly slower rate of progress, ranging from 13 to 36 months to halve the computation required to reach a fixed level of performance.
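As a quick check, the headline Hernandez and Brown figure quoted above is internally consistent: a ~44x reduction in training FLOP over the 84 months from 2012 to 2019 implies a halving time of about 15-16 months.

```python
import math

# Implied halving time from a ~44x training-FLOP reduction over 2012-2019.
months = 7 * 12                # 84 months
halvings = math.log2(44)       # ~5.46 halvings of required compute
print(months / halvings)       # ~15.4 months per halving, i.e. roughly 16 months
```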
This means the required compute was halving every ~16 months for reaching AlexNet performance and every ~13 months for linear programming. While Claude Opus 4.5 does seem to think that Paul’s belief is close to what Grace’s paper implies, the paper’s relevance is arguably undermined by Cotra’s own caveats about it. Cotra then listed her actual assumptions for each of the anchors, including the two clearly BSed ones. I have marked her assumed halving times for the required compute in bold, and a short sketch after the list compares them with the empirical estimates above:
Cotra’s actual assumptions
I assumed that:
Training FLOP requirements for the Lifetime Anchor hypothesis (red) are **halving once every 3.5 years** and there is only room to improve by ~2 OOM from the 2020 level—moving from a median of ~1e28 in 2020 to ~1e26 by 2100.
Training FLOP requirements for the Short horizon neural network hypothesis (orange) are **halving once every 3 years** and there is room to improve by ~2 OOM from the 2020 level—moving from a median of ~1e31 in 2020 to ~3e29 by 2100.
Training FLOP requirements for the Genome Anchor hypothesis (yellow) are **halving once every 3 years** and there is room to improve by ~3 OOM from the 2020 level—moving from a median of ~3e33 in 2020 to ~3e30 by 2100.
Training FLOP requirements for the Medium-horizon neural network hypothesis (green) are **halving once every 2 years** and there is room to improve by ~3 OOM from the 2020 level—moving from a median of ~3e34 in 2020 to ~3e31 by 2100.
Training FLOP requirements for the Long-horizon neural network hypothesis (blue) are **halving once every 2 years** and there is room to improve by ~4 OOM from the 2020 level—moving from a median of ~1e38 in 2020 to ~1e34 by 2100.
Training FLOP requirements for the Evolution Anchor hypothesis (purple) are **halving once every 2 years** and there is room to improve by ~5 OOM from the 2020 level—moving from a median of ~1e41 in 2020 to ~1e36 by 2100.
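For contrast with the empirical estimates above (~16 months for ImageNet, ~13 months for linear programming, 13-36 months across Grace’s domains), here is a short sketch of what Cotra’s assumed halving times imply. The anchor names, halving times and room-to-improve figures come straight from her list; the choice of a 16-month benchmark rate is my own and may or may not transfer to frontier training runs:

```python
import math

# For each anchor: how long does it take to realise the full "room to improve"
# at Cotra's assumed halving time versus a 16-month (ImageNet-style) halving time?
OOM_PER_HALVING = math.log10(2)   # one halving of compute = ~0.30 OOM

anchors = {
    # name: (assumed halving time in years, room to improve in OOM)
    "Lifetime":          (3.5, 2),
    "Short-horizon NN":  (3.0, 2),
    "Genome":            (3.0, 3),
    "Medium-horizon NN": (2.0, 3),
    "Long-horizon NN":   (2.0, 4),
    "Evolution":         (2.0, 5),
}

for name, (halving_years, room_oom) in anchors.items():
    halvings_needed = room_oom / OOM_PER_HALVING
    years_cotra = halving_years * halvings_needed
    years_16mo = (16 / 12) * halvings_needed
    print(f"{name:18s}: {years_cotra:5.1f} y at Cotra's rate, "
          f"{years_16mo:5.1f} y at a 16-month halving time")
```

At the empirical rates, each anchor’s allowed improvement would be exhausted within about one to two decades; at Cotra’s assumed rates it takes roughly two to three decades or more.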