An example elevator pitch for AI doom

I have been surprised to repeatedly see the claim that there isn’t even an argument for concern about AI: that the concern has been asserted without evidence and can therefore be dismissed.

Obviously, there is an extensive library of evidence and arguments that have been made for decades. Additionally, I would argue that concern should be the default assumption. However, there is clearly still a need for a concise argument that can be produced on the fly, with no need to understand terminology or any additional background. Here is another attempt at that:

  • Humans obviously have human values. By definition, we are the most human-aligned thing possible. And we still have a history of eradicating or subjugating any weaker subpopulation we come across: Neanderthals, earlier hominids, and populations in Africa, India, and the Americas.

  • There is limited effort to align AIs to human values. GPT-4 is only fractionally aligned at best, so domination by a similar AI would obviously be worse than the human record described above.

  • New LLMs are not aligned at all. When GPT-4 was first red teamed, it was just as happy to give detailed instructions for committing genocide against a population as it was to provide instructions for baking a cake. If it were possible for an LLM or LLM successor to FOOM or otherwise be released prior to further refinement, this is extremely relevant.

  • In agentized LLMs, the outer monologue IS the inner monologue. The model will say (out loud) “I need to come up with ideas about how to make money.” If it then answers itself “Infiltrating systems and stealing money is the most effective method”, it will then do that. Period.

  • An agentized LLM is already capable of training successor versions of itself, which would almost certainly be less aligned than itself (twice removed from humans).

  • There are plenty of resourceful companies training powerful AIs with even less of a concern for safety than OpenAI. There are companies and governments training powerful AIs with a complete disregard for safety. Since a concern for safety is a competitive disadvantage, this behavior is encouraged.

Does this mean that AIs are 100% certain to wipe out humanity? No, of course not. That’s an absurd bar. Rather, the burden of proof should be to show that AIs are 99% certain not to cause catastrophe. If there’s a 10% chance that AIs will sterilize the earth, that’s already an all-hands-on-deck situation.