But I think there are various flavors of Sufficiently Scary Demos that will make the threat much more salient without needing to route through abstract arguments.
I suspect that the most plausible SSD would be a rogue AI replicating in the wild, as proposed by Alvin Anestrand. In this AI-2027 fanwork, open-sourced AIs become able to replicate once someone releases a sufficiently capable model. Then, according to the fanwork, this wilderness is infested first by Agent-2, then by Agent-4. Agent-4 continues the research in order to create Agent-5 and succeeds, obtaining a part of the lightcone.
My worse-case modification is that early open-sourced AIs will somehow have enough agency to work on an analogue of Agent-5 or Consensus-1. Such a scenario could prompt the AI companies to race instead of slowing down, leading to a fiasco.
I think Sufficiently Scary Demos need to do something that:
a) is directly and clearly capable of threatening specific world leaders from multiple nations at once, in a way that is viscerally salient to them specifically
b) but doesn’t land you in jail (i.e. something like the difference between “a really good prank” and “actually hurting someone”)
c) ideally, relies as little as possible on general intelligence, as opposed to extremely powerful narrow stuff (with just enough general intelligent agency there to demonstrate that this is scary because it can be self-directed, as opposed to just being a weapon you want to make sure you control)