But I’m not sure. And it seems like this could use more thought.
I don’t think the solar-system scenario you seem to imagine would be durable. Some asshole with an intent-aligned AGI (or who is themselves a misaligned AGI) will figure out how to make the sun go nova while they head elsewhere in hopes of claiming the rest of the lightcone.
Let’s just say that Alvin Anestrand did explore the issues in detail. In his take, the USA and China coordinate to create an aligned ASI, while the misaligned Agent-4 escapes, takes over the world of rogue AIs, and incorporates itself into the ASI that ends up ruling the lightcone. I would also like @Dev.Errata to sketch an alternate scenario explaining how the emergence of such an ASI could be prevented, or the original AI-2027 team to take a look at the Rogue Replication scenario.
However, I did propose a mechanism which could prevent rogue replication. Suppose the humans at OpenBrain ask their AIs to practice finetuning open-source models on bioweapons data and to report whether open-source models can be used for this purpose. OpenBrain could then push for laws requiring any American company to test its models before publication, on pain of compute confiscation, unless the model is too dumb to proliferate or to be of use to terrorists.
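For concreteness, here is a minimal sketch of what such a release gate might look like. Everything in it is an assumption for illustration: the function names, the proxy evaluation, and the uplift threshold stand in for whatever pipeline OpenBrain’s AIs would actually build, and the scores are dummy numbers so the sketch runs end to end.

```python
# Hypothetical sketch of a finetuning-based release gate (all names assumed,
# not anything OpenBrain or any real lab actually does).

from dataclasses import dataclass


@dataclass
class EvalResult:
    baseline_score: float   # hazardous-capability score before finetuning
    finetuned_score: float  # score after finetuning on a benign proxy dataset

    @property
    def uplift(self) -> float:
        # How much capability the finetuning added.
        return self.finetuned_score - self.baseline_score


def finetune_on_proxy_data(model_id: str) -> EvalResult:
    """Placeholder: finetune the open-weights model on a benign proxy for the
    hazardous data and score it before and after.  A real pipeline would call
    an actual training and evaluation harness here."""
    return EvalResult(baseline_score=0.10, finetuned_score=0.12)  # dummy values


UPLIFT_THRESHOLD = 0.05  # assumed policy threshold, purely illustrative


def may_release(model_id: str) -> bool:
    """Allow release only if finetuning produces negligible uplift, i.e. the
    model is too dumb to proliferate or to be of use to terrorists."""
    result = finetune_on_proxy_data(model_id)
    return result.uplift < UPLIFT_THRESHOLD


if __name__ == "__main__":
    print(may_release("example/open-model"))  # True with the dummy numbers
```

The point is only the shape of the check: measure how much finetuning raises hazardous capability, and permit publication only when that uplift is negligible.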
Those links were both incredibly useful, thank you! The Rogue Replication timeline is very similar in thrust to the post I was working on when I saw this, but worked out in detail, well written, and thoroughly researched. Your proposed mechanism probably should not be deployed, though: I agree with the conclusion that rogue replication (or other obviously misaligned AI) is probably useful for elevating public concern about AI alignment risk.