Thanks for writing this. I’ve suspected for a while that we’re ahead, which is great but a bit emotionally difficult when I’d spent basically my whole career with the goal of heroically solving an almost impossible problem. This seems to be a common view among AI safety people; e.g. Ethan Perez said recently that he’s focused on problems other than takeover because he’s not as worried about it.
I do expect some instrumental pressure towards misaligned power-seeking, but the number of tools we have to understand, detect, and prevent it is now large enough that it seems we’ll be fine until the Dyson sphere is under construction. At that point things get a lot more uncertain, but probably we’ll figure something out there too.
“which is great but a bit emotionally difficult when I’d spent basically my whole career with the goal of heroically solving an almost impossible problem”
I feel exactly the same way. You have put it into words perfectly.
I want to make sure we’re not fooling ourselves into perpetuating a problem that might not exist but gives us ‘meaning’, as many nonprofits are wont to do; at the same time, I’m not confident the problem is solved, and we should still be watchful. But it feels a lot less meaningful to go from “hero that will solve alignment” to “lol Paul Christiano solved alignment in 2017, we just didn’t know it yet, we’re just being careful now”. And it’s important to be careful, especially with astronomical stakes!! But it feels less meaningful.