I think there’s a spectrum of belief regarding AGI power and danger.
There are people who are optimistic about AGI (but who worry about bad human users):
Eric Drexler (“Reframing Superintelligence” + LLMs + 4 years)
A Solution for AGI/ASI Safety
This post
They often think the “good AGI” will keep the “bad AGI” in check. I really disagree with that, because:
The “population of AGIs” is nothing like a population of humans: it is far more homogeneous, because the most powerful AGI can simply copy itself until it takes over most of the compute. And if we fail to align them, the different AGIs will end up misaligned for the same reason.
Eric Drexler envisions humans equipped with AI services acting as the good AGI. But having a human control enough decisions to ensure alignment will slow things down.
If the first ASI is bad, it may build replicating machines/nanobots.
There are people who worry about slow takeoff risks:
Redwood
This comment by Buck
Ryan Greenblatt’s comment above: “winning a war against a rogue AI seems potentially doable, including a rogue AI which is substantially more capable than humans”
Dan Hendrycks’s views on AGI selection pressure
I think Anthropic’s view is here
Eric Drexler again (Applying superintelligence without collusion)
It looks like your comment is here
They worry about “von Neumann level AGI,” which poses a threat to humanity because it could build mirror bacteria and threaten humanity into following its will. The belief is that a war between it and humanity would be drawn out and uncertain, and that there may be negotiations.
They may imagine good AGI and bad AGI existing at the same time, but they aren’t sure the good ones will win. Dan Hendrycks’s view is that the AGI will start off aligned, but humanity may become economically dependent on it and fall for its propaganda until it evolves into misalignment.
Finally, there are people who worry about fast takeoff risks:
The Case Against AI Control Research
MIRI
Nick Bostrom
They believe that von Neumann level AGI will not pose much direct risk, but that it will be better than humans at AI research (imagine a million AI researchers) and will recursively self-improve to superintelligence.
The idea is that AI research powered by the AIs themselves will be limited by the speed of computers, not the speed of human neurons, so its pace need not resemble the pace of human research at all. Truly optimal AI research probably needs only tiny amounts of compute to reach superintelligence: DeepSeek’s cutting-edge AI supposedly took only $6 million to train, while four US companies spent around $210 billion on infrastructure (mostly for AI); a rough back-of-envelope comparison follows below.
Superintelligence will not need to threaten humans with bioweapons or fight a protracted war. Once it actually escapes, it will defeat humanity with absolute ease. It can build self-replicating nanofactories which grow as fast as bacteria and fungi, and which form body plans as sophisticated as those of animals.
Soon after it builds physical machines, it will expand across the universe as close to the speed of light as physically possible.
These people worry about the first AGI/ASI being misaligned, but don’t worry about the second one as much because the first one would have already destroyed the world or saved the world permanently.
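A minimal back-of-envelope sketch of the scale gap mentioned above, taking both figures exactly as cited in this discussion (the $6 million number is disputed later in the thread and covers only the final training run):

```python
# Scale gap between the reported DeepSeek-V3 training cost and the reported
# US infrastructure spend. Both numbers are simply the figures cited above.
deepseek_v3_reported_cost = 6e6   # ~$6 million (supposedly), final training run only
us_infrastructure_spend = 210e9   # ~$210 billion across four US companies (mostly for AI)

ratio = us_infrastructure_spend / deepseek_v3_reported_cost
print(f"Infrastructure spend is roughly {ratio:,.0f}x the reported training cost")
# Prints: Infrastructure spend is roughly 35,000x the reported training cost
```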
I consider myself split between the second and third groups.
FWIW, I think recursive self-improvement via just software (a software-only singularity) is reasonably likely to be feasible (perhaps 55%), but this alone doesn’t suffice for takeoff being arbitrarily fast.
Further, even an objectively very fast takeoff (von Neumann to superintelligence in 6 months) can be enough time to win a war, etc.
I agree: a lot of outcomes are possible, and there’s no reason to think only fast takeoffs are dangerous and likely.
Also, I went too far in saying that it “needs only tiny amounts of compute to reach superintelligence” without caveats. The $6 million figure is disputed by a video arguing that DeepSeek used far more compute than they admit to.
The prior reference is a Dylan Patel tweet from Nov 2024, in the wake of the R1-Lite-Preview release:
“Deepseek has over 50k Hopper GPUs to be clear. People need to stop acting like they only have that 10k A100 cluster. They are omega cracked on ML research and infra management but they aren’t doing it with that many fewer GPUs.”
DeepSeek explicitly states that “DeepSeek-V3 requires only 2.788M H800 GPU hours for its full training.”
This seems unlikely to be a lie: if it were, the reputational damage would have motivated not mentioning the amount of compute at all. And the most interesting thing about DeepSeek-V3 is precisely this claim, that its quality is possible with so little compute.
Certainly designing the architecture, the data mix, and the training process that made it possible required much more compute than the final training run, so in total it cost much more to develop than $6 million. And the 50K H100/H800 system is one way to go about that, though renting a bunch of 512-GPU instances from various clouds probably would’ve sufficed as well.
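As a sanity check on the numbers above, here is a minimal sketch converting the reported 2.788M H800 GPU hours into a dollar figure. The ~$2 per GPU-hour rental price is an assumed market rate, not a number from this thread, but it lands close to the widely cited $6 million:

```python
# Rough conversion of reported GPU-hours into an approximate training cost.
# The 2.788M H800 GPU-hour figure is DeepSeek's own reported number (quoted above);
# the ~$2 per GPU-hour rental rate is an assumption, not a figure from this thread.
h800_gpu_hours = 2.788e6          # reported GPU-hours for the full V3 training run
assumed_usd_per_gpu_hour = 2.0    # assumed rental price per H800 GPU-hour

final_run_cost = h800_gpu_hours * assumed_usd_per_gpu_hour
print(f"Estimated final training run cost: ${final_run_cost / 1e6:.2f}M")
# Prints: Estimated final training run cost: $5.58M  (in the ballpark of "$6 million")
```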
I see, thank you for the info!
I don’t actually know about DeepSeek-V3; I just felt that if I pointed out the $6 million claim in my argument, I shouldn’t hide the fact that I watched a video which made me doubt it.
I wanted to include the video as a caveat just in case the $6 million was wrong.
Your explanation suggests the $6 million is still in the ballpark (for the final training run), so the concerns about a “software-only singularity” are still very realistic.
The “population of AGIs” is nothing like a population of humans: it is far more homogeneous, because the most powerful AGI can simply copy itself until it takes over most of the compute. And if we fail to align them, the different AGIs will end up misaligned for the same reason.
I agree with this, with the caveat that synthetic data usage, especially targeted synthetic data as part of alignment efforts, can make real differences in AI values. But yes, this is a big factor people underestimate.
How does that relate to homogeneity?
Mostly, it means that while the AIs produced within a single company are likely to have homogeneous values and traits, different firms will have differing AI values, meaning that the values and experiences of AIs will be drastically different between firms.