What are the flaws in this argument about p(Doom)?

Technical alignment is hard

Technical alignment will take 5+ years

AI capabilities are currently subhuman in some areas (driving cars), roughly human-level in others (the bar exam), and superhuman in others (playing chess)

Capabilities scale with compute

The doubling time for AI compute is ~6 months

In 5 years compute will scale by 2^(5÷0.5) = 2^10 = 1024 times
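
A quick sanity check of that arithmetic, as a minimal sketch assuming the ~6-month doubling time above holds for the full 5 years:

```python
# Sanity-check the compute-scaling arithmetic.
# Assumption (from the argument above): compute doubles every 6 months.
years = 5
doubling_time = 0.5  # years per doubling
doublings = years / doubling_time  # 10 doublings in 5 years
scale = 2 ** doublings
print(f"{doublings:.0f} doublings -> {scale:.0f}x compute")  # 10 doublings -> 1024x compute
```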

In 5 years, with ~1024 times the compute, AI will be superhuman at most tasks, including designing AI

Designing a better version of itself will increase an AI’s expected reward

An AI will design a better version of itself and recursively loop this process until it reaches some limit
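
As a toy sketch of that looping step (every number here is a made-up illustrative assumption, not part of the argument):

```python
# Toy model of recursive self-improvement hitting a limit.
# All quantities are illustrative assumptions, not claims from the argument.
capability = 1.0   # current AI's capability (arbitrary units)
limit = 1_000.0    # hypothetical physical/algorithmic ceiling
gain = 2.0         # multiplier each successor applies to capability
generations = 0

# Each generation designs a better successor, but with diminishing
# returns, so the process converges instead of growing forever.
while capability * gain <= limit:
    capability *= gain
    gain = 1 + (gain - 1) * 0.9  # diminishing returns per generation
    generations += 1

print(f"stopped after {generations} generations at capability {capability:.0f}")
```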

Such an AI will be superhuman at almost all tasks, including computer security, R&D, planning, and persuasion

The AI will deploy these skills to increase its expected reward

Human survival is not in the AI’s reward function

The AI will kill off most or all humans to prevent them from possibly decreasing its reward

Therefore: p(Doom) is high within 5 years


Despite what the title says, this is not a perfect argument tree. Which part do you think is the most flawed?

Edit: As per request, the title has been changed from the humorous “An utterly perfect argument about p(Doom)” to “What are the flaws in this argument about p(Doom)?”

Edit2: Yay, Frontpage! Totally for the wrong reasons, though

Edit3: added “, with ~1024 times the compute,” to “In 5 years AI will be superhuman at most tasks including designing AI”