> If hypothetically you had a real solution, and triple quadruple checked everything, and did a sane and moral process to work out governance, then I think I’d want the plan to be executed, including “burn all the GPUs” or similar.
First, note that the context of my old debate was MIRI’s plan to build a Friendly (sovereign) AI, not the later “burn all the GPUs” Task AI plan. If I were debating the Task AI plan, I’d probably emphasize the “roll your own metaethics” aspect a bit less (although even the Task AI would still have philosophical dependencies like decision theory), and emphasize more that there aren’t good candidate tasks for the AI to do. E.g., “burn all the GPUs” wouldn’t work because the AI race would just restart the day after, with everyone building new GPUs. (This is not Eliezer’s actual task for the Task AI, but I don’t remember his rationale for keeping the actual task secret, so I don’t know if I can talk about it here. I think the actual task has similar problems though.)
My other counterarguments all apply as written, so I’m confused that you seem to have entirely ignored them. I guess I’ll reiterate some of them here:
What’s a sane and moral process to work out governance? Did anyone write something down? It seems implausible to me, given other aspects of the plan (i.e., speed and secrecy). If one’s standard for “sane and moral” is something like the current Statement on Superintelligence, then it just seems impossible.
“Triple quadruple checked everything” can’t be trusted when you’re a small team aiming for speed and secrecy. There are instances where widely deployed, supposedly “provably secure” cryptographic algorithms and protocols (with proofs published and reviewable by the entire research community, which has clear incentives to find and publish any flaws) turned out years later to be insecure, because some implicit or explicit assumption used by the proof (e.g., about what the attacker is allowed to do) turned out to be wrong. And cryptography is a much better understood, inherently simpler problem that has been studied for decades, with public adversarial review processes that mitigate human biases far better than a closed small team can.
See also items 2 and 5 in my OP.
> and not be overconfident
I didn’t talk about this in the OP (to avoid distracting from other, more important points), but I think Eliezer at least was/is clearly overconfident, judging from a number of observations including his confidence in his philosophical positions. (And overconfidence is just quite hard to avoid in general.) We’re lucky in a way that his ideas for building FAI or a safe Task AI didn’t almost work out, but instead fell wide of the mark; otherwise I think MIRI itself would have had a high chance of destroying the world.