Three Minimal Pivotal Acts Possible with Narrow AI

The alignment problem appears extremely concerning, so here’s my attempt to explain the most likely way we could end up in an aligned future.

The Perfect Tool AI

Imagine a Tool AI, something like DALL-E 2 or GPT-3. It’s a relatively simple AI, and it does one of three things exceptionally well:

-A Perfect Creator. Manufactures and controls nanotechnology to rearrange atoms in a given space. If you point this tool at an area, it will use self-replicating nanotechnology to disassemble that area and rebuild it into basically whatever you want. You would have a perfect, unstoppable factory that can produce anything from anything. In this case, you could tell the AI to disassemble the hardware of every AI lab on Earth, create life-extending technologies, and live as an immortal steward over humanity while you slowly put together a team of alignment researchers to build an aligned AI over the centuries.

Likelihood of creation: Very low. Nanotech is a hard problem to solve, and controlling that nanotech is just as difficult.

-A Perfect (Computer) Virus. Imagine a worm like Stuxnet, but one that contains a simple script connecting each compromised machine to a higher AI controller, and that spawns new controllers once it has sufficient compute. With each computer it infects, it gets smarter, more powerful, and, most importantly, better at infecting computers. Human security teams cannot keep up. The Internet is basically an ecosystem, the same way Earth was before the explosion of cyanobacteria, and it is ripe for the taking by an advanced AI. Whoever is behind this virus can now control every computer on Earth. If they’re focused on AI alignment, they can disable any computers connected to an AI lab and slowly assemble a team of alignment engineers to solve AGI alignment, while freeing up compute for other tasks like life-extension technology.

Likelihood: Medium. I am honestly surprised we haven’t already seen AI-powered security breaches. It seems feasible.

-A Perfect Manipulator. GPT-3 is already surprisingly convincing, but what if you trained an AI to be perfectly convincing? I have seen incredibly charismatic people change another person’s political beliefs over the span of a conversation. There is some string of text that could likely convince you of just about anything, and if an AI gets good enough at that, you might literally be able to convince the whole world of the importance of AI Alignment.

Likelihood: Medium. A crude version seems achievable: train a model to write highly upvoted posts, then make alignment the topic. I’m sure GPT-3 could produce a thousand variations of this post and spread them across the internet, and layering another AI on top of it to search for increasingly persuasive arguments should also be feasible.


If Eliezer is correct and AGI alignment is nearly unsolvable in our timeframe, then we’re basically going to have to move the alignment problem onto a human with access to godlike technology, and hope that they cripple anyone else’s chances of creating other godlike Tool AIs.

Either way, awareness is the most critical piece of the puzzle at this juncture. Start posting about AI alignment. We have such a tiny fraction of the human population thinking about this problem, so we need more minds.

I’d like feedback on whether a single human operator with access to a godlike AI could delay the advent of AGI until alignment becomes possible. Could this be a solution, even if it’s far-fetched? After thinking about it extensively, it’s the only solution I can come up with, and at the very least, I do not see what would make it unworkable. Even if absolute power corrupts (a human operator), I do not believe it would corrupt absolutely (resulting in the destruction of the human race).