I often hear people dismiss AI control by saying something like, "most AI risk doesn't come from early misaligned AGIs." While I mostly agree with this point, I think it fails to engage with several of the more important arguments in favor of control; for instance, catching misaligned actions might be extremely important for alignment. In general, I think control has many benefits that are instrumentally useful for preventing misaligned ASIs down the line, and I wish people would engage with these more thoughtfully.