I’d say that my all-considered tradeoff curve is something like 0.1% existential risk per year of delay.
For what it’s worth, from a societal perspective this seems very aggressive to me and a big outlier in human preferences. I would be extremely surprised if any government in the world would currently choose a 0.1% risk of extinction in order to accelerate AGI development by 1 year, if they actually faced that tradeoff directly. My guess is society-endorsed levels are closer to 0.01%.
As far as my views, it’s worth emphasizing that it depends on the current regime. I was supposing that at least the US was taking strong actions to resolve misalignment risk (which is resulting in many years of delay). In this regime, exogenous shocks might alter the situation such that powerful AI is developed under worse governance. I’d guess the risk of an exogenous shock like this is around 1% per year, and there’s a substantial chance this would greatly increase risk. So, in the regime where the government is seriously considering the tradeoffs and taking strong actions, I’d guess 0.1% is closer to rational (if you don’t have a preference against the development of powerful AI regardless of misalignment risk, which might be close to the preference of many people).
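To make the implied arithmetic concrete, here is a minimal back-of-the-envelope sketch. Only the ~1%/year shock probability comes from the paragraph above; the two conditional values (`p_worse_regime` and `added_risk_if_worse`) are illustrative placeholders, not stated estimates.

```python
# Rough sketch of the "exogenous shock" channel for extra risk per year of delay.
# The 1%/year figure is from the text above; the other two numbers are
# illustrative placeholders chosen only to show how the pieces combine.

p_shock_per_year = 0.01      # ~1%/year chance of an exogenous shock during the delay
p_worse_regime = 0.5         # placeholder: chance the shock leads to much worse governance
added_risk_if_worse = 0.2    # placeholder: extra existential risk if AI is then built in that regime

extra_risk_per_year_of_delay = p_shock_per_year * p_worse_regime * added_risk_if_worse
print(f"{extra_risk_per_year_of_delay:.2%}")  # 0.10% with these particular placeholder numbers
```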
I agree that governments in practice wouldn’t eat a known 0.1% existential risk to accelerate AGI development by 1 year, but also governments aren’t taking AGI seriously. Maybe you mean even if they better understood the situation and were acting rationally? I’m not so sure; see, e.g., nuclear weapons, where governments seemingly eat huge catastrophic risks that seem doable to mitigate at some cost. I do think status quo bias might be important here: accelerating by 1 year, which gets you 0.1% additional risk, might be very different from delaying by 1 year, which saves you 0.1%.
(Separately, I think existential risk isn’t the same as extinction risk, and this might make a factor-of-2 difference to the situation if you don’t care at all about anything other than current lives.)
So, in the regime where the government is seriously considering the tradeoffs and taking strong actions, I’d guess 0.1% is closer to rational (if you don’t have a preference against the development of powerful AI regardless of misalignment risk, which might be close to the preference of many people).
Ah, sorry: if you are taking into account exogenous shifts in risk attitudes and in how careful people are, starting from a high baseline, then I agree this makes sense. I was reading this as a straightforward tradeoff of 0.1% existential risk vs. 1 year of benefits from AI.
Yeah, on the straightforward tradeoff (ignoring exogenous shifts/risks, etc.), my views put me at more like 0.002%.