IF we have gotten, and will keep getting, strong scaling-law improvements, then:
openai’s plan to continue to acquire way more training compute even into 2029 is either lies or a mistake
we’ll get very interesting times quite soon
offense-defense balances and multi-agent-system dynamics seem like good research directions, if you can research fast and have reason to believe your research will be implemented in a useful way
EDIT: I no longer fully endorse the crossed-out bullet point (the one calling OpenAI’s 2029 compute acquisition lies or a mistake). Details in replies to this comment.
Disagree on pursuit of compute being a mistake in one of those worlds but not the other. Either way you are going to want as much inference as possible during key strategic moments.
This seems even more critically important if you are worried your competitors will have algorithms nearly as good as yours.
[Edit: roon posted the same thought on xitter the next day https://x.com/tszzl/status/1883076766232936730
roon @tszzl
if the frontier models are commoditized, compute concentration matters even more
if you can train better models for fewer flops, compute concentration matters even more
compute is the primary means of production of the future and owning more will always be good
12:57 AM · Jan 25, 2025
roon @tszzl
imo, open source models are a bit of a red herring on the path to acceptable asi futures. free model weights still don’t distribute power to all of humanity, they distribute it to the compute rich
https://x.com/MikePFrank/status/1882999933126721617
Michael P. Frank @MikePFrank
Since R1 came out, people are talking like the massive compute farms deployed by Western labs are a waste, BUT THEY’RE NOT — don’t you see? This just means that once the best of DeepSeek’s clever cocktail of new methods are adopted by GPU-rich orgs, they’ll reach ASI even faster. ]
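roon’s second bullet (that cheaper training makes compute concentration matter more, not less) can be illustrated with a toy power-law scaling model. All the constants below are hypothetical, picked only to show the structural point: a multiplicative algorithmic-efficiency gain raises everyone’s capability, but leaves the compute-rich org’s relative lead exactly intact.

```python
# Toy Chinchilla-style power law: loss falls as (effective compute)^-alpha.
# ALPHA and the FLOP budgets are hypothetical illustration values.

ALPHA = 0.05  # hypothetical scaling exponent

def loss(physical_compute: float, efficiency: float = 1.0) -> float:
    """Loss under a power law in effective compute = efficiency * physical."""
    return (efficiency * physical_compute) ** -ALPHA

rich, poor = 1e26, 1e24  # hypothetical FLOP budgets, 100x apart

for eff in (1.0, 10.0):  # before and after a 10x algorithmic-efficiency gain
    ratio = loss(poor, eff) / loss(rich, eff)
    print(f"efficiency {eff:>4}x: rich loss {loss(rich, eff):.4f}, "
          f"poor loss {loss(poor, eff):.4f}, loss ratio {ratio:.4f}")

# The loss ratio is identical in both rows: efficiency multiplies effective
# compute, so the relative advantage from compute concentration is unchanged,
# while both parties land at higher absolute capability.
```

Because efficiency enters multiplicatively, the ratio of losses depends only on the ratio of physical compute; under this (simplified) model, DeepSeek-style method improvements accelerate GPU-rich orgs at least as much as everyone else.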
Agreed. However, in the fast world the game is extremely likely to end before you get to use 2029 compute.
EDIT: I’d be very interested to hear an argument against this proposition, though.
I don’t know if the plan is to have the compute from Stargate become available in incremental stages, or all at once in 2029.
I expect timelines are shorter than that, but I’m not certain. If I were in OpenAI’s shoes, I’d want to hedge my bets. 2026 seems plausible. So does 2032. My peak expectation is sometime in 2027, but I wouldn’t want to go all-in on that.
I am almost totally positive that the plan is not that.
If planning for 2029 is cheap, then it probably makes sense under a very broad class of timeline expectations.
If it is expensive, then the following applies to the hypothetical presented by the tweet:
The timeline evoked in the tweet seems extremely fast and multipolar. I’d expect planning for 2029 compute scaling to make sense only if the current paradigm gets stuck at roughly AGI-level capabilities (i.e., very good scaffolding around a model similar to, but a bit smarter than, o3). This is because if the paradigm scales past that point, it will do so fast, requiring little compute, as the tweet suggests. And if capabilities arbitrarily better than o4-with-good-scaffolding are compute-cheap to develop, then things almost certainly get very unpredictable before 2029.
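The cheap-vs-expensive hedging logic can be sketched as a toy expected-value calculation. The probabilities and payoffs below are entirely hypothetical, loosely echoing the timeline spread discussed above (2026 plausible, peak around 2027, 2032 also plausible); the point is only the structure: compute that comes online in 2029 pays off only in the worlds where the game hasn’t already ended by then.

```python
# Toy EV sketch of the "plan for 2029 compute?" question.
# P(transformative AI arrives in a given year); all values hypothetical.
timeline = {2026: 0.15, 2027: 0.35, 2028: 0.20, 2029: 0.10,
            2030: 0.05, 2031: 0.05, 2032: 0.10}

def value_of_2029_compute(cost: float, payoff_if_still_relevant: float) -> float:
    """EV of compute that only comes online in 2029: it pays off only in
    worlds where transformative AI hasn't arrived before 2029."""
    p_still_relevant = sum(p for year, p in timeline.items() if year >= 2029)
    return p_still_relevant * payoff_if_still_relevant - cost

# Cheap planning: positive EV even though only ~30% of worlds reach 2029.
print(value_of_2029_compute(cost=1.0, payoff_if_still_relevant=10.0))
# Expensive planning: negative EV under the same timeline distribution.
print(value_of_2029_compute(cost=5.0, payoff_if_still_relevant=10.0))
```

Under these made-up numbers, cheap 2029 planning is worth it as a hedge even on short-peaked timelines, while expensive planning only pencils out if you put substantially more probability mass on the paradigm getting stuck.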