Yeah, it genuinely confuses me that people who talk about post-alignment problems[1] tend not to support a pause, while people who support a pause tend not to talk much about these problems.
I have decided that my new hobbyhorse is getting people who talk about post-alignment problems to change their minds on pausing, or at least to engage with the possibility instead of ignoring it. I’m actually not super confident that pause advocacy is the best move on the margin, but I want people to consider it more seriously.
RE the name, my first draft called them “non-alignment problems”, but a reviewer said this makes it sound like “the problem of AI not being aligned”. I spent a long time thinking about names and couldn’t come up with anything satisfying. “non-alignment AI x-risks” is too long IMO.
I think of post-alignment problems as coming “after” alignment in the sense that if you mess up ASI, the problem that kills you first is misalignment.