Thanks for writing this reflection, I found it useful.
Just to quickly comment on my own epistemic state here:
I haven’t read GD.
But I’ve been stewing on some of (what I think are) the same ideas for the last few months, since William Brandon first made (what I think are) similar arguments to me in October.
(You can judge from this Twitter discussion whether I seem to get the core ideas)
When I first heard these arguments, they struck me as quite important and outside the wheelhouse of previous thinking on risks from AI development. I think they raise a concern that I don’t currently know how to refute: “even if we solve technical AI alignment, we still might lose control over our future.”
That said, I’m currently in a state of “I don’t know what to do about GD-type issues, but I have a lot of ideas about what to do about technical alignment.” For me at least, I think this creates an impulse to dismiss GD-type concerns, so that I can justify continuing to do something where “the work is cut out for me” (if not in absolute terms, then at least relative to working on GD-type issues).
In my case in particular, I think it actually makes sense to keep working on technical alignment (because I think it’s going pretty productively).
But I think that other people who work in (or are considering working in) technical alignment or governance should maybe consider trying to make progress on understanding and solving GD-type issues (assuming that’s possible).