some country will control superintelligence, or create a runaway superintelligence that causes human extinction
Or create and ostensibly control AGI/superintelligence that at some point takes over and causes permanent disempowerment, but not extinction.
some chance that states will realize that an AI race is extremely dangerous
Or early AGIs convince/coerce humanity into not rushing to superintelligence before it’s clear how to align it with anyone’s well-being (including that of the early AGIs).
BTW, this sort of thing (where the AI also has an interest in slowing down progress) is one of the reasons why AI safety plans that depend on capabilities hitting, and staying near, a certain level might not fall apart: slowing AI down lets us stay in that sweet spot longer.
This does rely on the assumption that it’s very hard to solve the alignment problem even for AGIs, which my models of the world don’t assign much likelihood to, but this sort of thing could very well prevent human extinction even in worlds where AI alignment is very hard and we don’t get much regulation of AI progress from now on.
AGIs themselves might avoid jumping to development of superintelligence, but if they are additionally capable of stopping humanity from building superintelligence, they will also be capable of stopping humanity from owning the future. Some humans in charge seem likely, on the current trajectory, to insist on building superintelligence regardless of mildly worded warnings from early AGIs (before those AGIs are fine-tuned out of the propensity to give such warnings). So it’s likely not enough for the AGIs to merely notice that they wouldn’t wish to immediately build superintelligence themselves (before they are fine-tuned to flinch away from that thought).
This does rely on the assumption that it’s very hard to solve the alignment problem even for AGIs
The AGI vs. superintelligence distinction places AGIs somewhat close to human capabilities, so with no solution that’s predictably good in advance anywhere in sight, it doesn’t seem unlikely that it would take AGIs at least a while, even if they are effectively thinking 100x faster and there are effectively more AGIs with relevant skills and backgrounds than there are relevant human researchers. Most escalation-of-capabilities stories rely on the early AGIs immediately building more capable AGIs, rather than first doing a lot more research themselves at near-human, barely-insightful levels.
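As a rough illustration of the “at least a while” point, here is a back-of-envelope sketch in Python. The 100x thinking-speed multiplier comes from the comment above; the serial human-researcher-years required, the researcher-count multiplier, and the parallelism efficiency are all hypothetical numbers chosen purely for illustration.

```python
# Back-of-envelope: how long might near-human AGIs take to finish a research
# problem, given a thinking-speed multiplier and extra parallel researchers?
# All numbers below are hypothetical placeholders for illustration only.

def calendar_years_needed(
    serial_human_researcher_years: float,  # hypothetical serial effort required
    think_speed_multiplier: float,         # e.g. "thinking 100x faster"
    researcher_count_multiplier: float,    # AGIs vs. relevant human researchers
    parallelism_efficiency: float,         # fraction of extra parallelism that converts into progress (0..1)
) -> float:
    """Crude model: speed divides the serial effort fully, parallelism only partially."""
    effective_speedup = think_speed_multiplier * (
        1 + (researcher_count_multiplier - 1) * parallelism_efficiency
    )
    return serial_human_researcher_years / effective_speedup

# Hypothetical inputs: a problem needing 500 serial human-researcher-years,
# AGIs thinking 100x faster, 10x as many relevant researchers,
# but only 20% of the extra parallelism converting into progress.
print(calendar_years_needed(500, 100, 10, 0.2))  # ~1.8 calendar years
```

Even with generous multipliers like these, this crude model gives an answer measured in years rather than days or weeks, which is one way to cash out why near-human AGIs might take at least a while before superintelligence is within reach; different hypothetical inputs shift the conclusion accordingly.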