Thanks for writing this reflection, I found it useful.
Just to quickly comment on my own epistemic state here:
I haven’t read GD.
But I’ve been stewing on some of (what I think are) the same ideas for the last few months, since William Brandon first made (what I think are) similar arguments to me in October.
(You can judge from this Twitter discussion whether I seem to get the core ideas)
When I first heard these arguments, they struck me as quite important and outside the wheelhouse of previous thinking on risks from AI development. I think they raise a concern that I don’t currently know how to refute: “even if we solve technical AI alignment, we still might lose control over our future.”
That said, I’m currently in a state of “I don’t know what to do about GD-type issues, but I have a lot of ideas about what to do about technical alignment.” For me at least, I think this creates an impulse to dismiss GD-type concerns, so that I can justify continuing to do something where “the work is cut out for me” (if not in absolute terms, then at least relative to working on GD-type issues).
In my case in particular, I think it actually makes sense to keep working on technical alignment (because I think it’s going pretty productively).
But I think that other people who work in (or are considering working in) technical alignment or governance should maybe consider trying to make progress on understanding and solving GD-type issues (assuming that’s possible).