Bringing in a quote from Twitter/x: (Not my viewpoint, just trying to broaden the discussion.)
https://x.com/DrJimFan/status/1882799254957388010
Jim Fan @DrJimFan Whether you like it or not, the future of AI will not be canned genies controlled by a “safety panel”. The future of AI is democratization. Every internet rando will run not just o1, but o8, o9 on their toaster laptop. It’s the tide of history that we should surf on, not swim against. Might as well start preparing now.
DeepSeek just topped Chatbot Arena, my go-to vibe checker in the wild, and two other independent benchmarks that couldn’t be hacked in advance (Artificial-Analysis, HLE).
Last year, there were serious discussions about limiting OSS models by some compute threshold. Turns out it was nothing but our Silicon Valley hubris. It’s a humbling wake-up call to us all that open science has no boundary. We need to embrace it, one way or another.
Many tech folks are panicking about how much DeepSeek is able to show with so little compute budget. I see it differently—with a huge smile on my face. Why are we not happy to see improvements in the scaling law? DeepSeek is unequivocal proof that one can produce unit intelligence gain at 10x less cost, which means we shall get 10x more powerful AI with the compute we have today and are building tomorrow. Simple math! The AI timeline just got compressed.
Here’s my 2025 New Year resolution for the community:
No more AGI/ASI urban myth spreading.
No more fearmongering.
Put our heads down and grind on code.
Open source, as much as you can.
Acceleration is the only way forward.
context: @DrJimFan works at Nvidia
IF we have been getting, and will keep getting, strong scaling-law improvements, then:
OpenAI’s plan to continue acquiring far more training compute even into 2029 is either lies or a mistake
we’ll get very interesting times quite soon
offense-defense balances and multi-agent-system dynamics seem like good research directions, if you can research fast and have reason to believe your research will be implemented in a useful way
EDIT: I no longer fully endorse the crossed-out bullet point. Details in replies to this comment.
Disagree on pursuit of compute being a mistake in one of those worlds but not the other. Either way you are going to want as much inference as possible during key strategic moments.
This seems even more critically important if you are worried your competitors will have algorithms nearly as good as yours.
[Edit: roon posted the same thought on xitter the next day https://x.com/tszzl/status/1883076766232936730
roon @tszzl
if the frontier models are commoditized, compute concentration matters even more
if you can train better models for fewer flops, compute concentration matters even more
compute is the primary means of production of the future and owning more will always be good
12:57 AM · Jan 25, 2025

roon @tszzl
imo, open source models are a bit of a red herring on the path to acceptable asi futures. free model weights still don’t distribute power to all of humanity, they distribute it to the compute rich
https://x.com/MikePFrank/status/1882999933126721617
Michael P. Frank @MikePFrank
Since R1 came out, people are talking like the massive compute farms deployed by Western labs are a waste, BUT THEY’RE NOT — don’t you see? This just means that once the best of DeepSeek’s clever cocktail of new methods are adopted by GPU-rich orgs, they’ll reach ASI even faster. ]
Agreed. However, in the fast world the game is extremely likely to end before you get to use 2029 compute.
EDIT: I’d be very interested to hear an argument against this proposition, though.
I don’t know if the plan is to have the compute from Stargate become available in incremental stages, or all at once in 2029.
I expect timelines are shorter than that, but I’m not certain. If I were in OpenAI’s shoes, I’d want to hedge my bets. 2026 seems plausible. So does 2032. My peak expectation is sometime in 2027, but I wouldn’t want to go all-in on that.
I am almost totally positive that the plan is not that.
If planning for 2029 is cheap, then it probably makes sense under a very broad class of timeline expectations.
If it is expensive, then the following applies to the hypothetical presented by the tweet:
The timeline evoked in the tweet seems extremely fast and multipolar. I’d expect planning for 2029 compute scaling to make sense only if the current paradigm gets stuck at ~AGI capability level (i.e., a very good scaffolding for a model similar to, but a bit smarter than, o3). This is because if it scales further than that, it will do so fast (requiring little compute, as the tweet suggests). If capabilities arbitrarily better than o4-with-good-scaffolding are compute-cheap to develop, then things almost certainly get very unpredictable before 2029.