I think this hypothetical identifies a crux, and my take is that it is quite technologically doable. It might even be doable by the US with current technology, but my main worry is that people will make bad decisions.
I’m less sure whether an individual frontier lab could do it.
Note that the AI can be corrigible to its developers—this isn’t in tension with subverting other projects. It doesn’t need to be a sovereign—it can be guided by human input somewhat like today. I’m not confident that alignment to this target will continue to be relatively easy, but this seems like a highly plausible trajectory.