Considered as a logical fallacy, it’s a form of hasty generalization.
But it’s interesting to call out the free-associative step specifically: the attempt to brainstorm possibilities, nominally with the intent of checking them. That’s a step where we can catch ourselves and say, “Hey, I’m doing that thing. I should be careful that I actually know what territory I’m mapping before I try to exhaustively map it out.”
Going to the AI Doom example:
A bunch of superforecasters were asked what probability they put on an AI killing everyone. They listed out the main ways an AI could kill everyone (pandemic, nuclear war, chemical weapons) and decided that none of those would be particularly likely to kill literally everyone. They ended up giving some ridiculously low figure; I think it was less than one percent. Their exhaustive free association did not turn up options like “An AI takes control of the entire supply chain and kills us by heating the atmosphere to 150 °C as a by-product of massive industrial activity.”
The point at which the reasoning has gone astray is in what sort of possibilities are being listed out. The set {pandemic, nuclear war, chemical weapons} seems to be drawn from the set of known natural and man-made disasters: things that have already killed a lot of people in the past.
But the set that should be listed out is actually all conditions that humans depend on to live. This includes a moderate-temperature atmosphere, land to grow food on, and so forth. Negating any one of those kills off the humans, regardless of whether it looks like one of those disasters we’ve survived before.
The forecasters are asking, “For each type of disaster, could runaway AI create a disaster of that type sufficient to kill all humans?” They think of a bunch of disaster types, decide that each wouldn’t be bad enough, and end up with a low P(doom).
But the question should really be, “For each material condition that humans depend on to live, could runaway AI alter that condition enough to make human life nonviable?” So instead of listing out types of known disaster, we list out everything humans depend on — like land, clean water, breathable atmosphere, and so on. Then for each one, we ask, “Is there some way an unaligned optimizer running on our planet could take this thing away?”
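To make the structural point concrete, here’s a minimal sketch in Python. The per-route probabilities are placeholders I’ve made up purely for illustration; the thing to notice is that whatever framing you brainstorm from determines which routes even appear in the sum, and anything you never thought to list gets silently treated as impossible.

```python
# Minimal sketch: an estimate built this way only covers what you enumerated.
# Every probability below is a made-up placeholder, purely for illustration.

def p_any(routes: dict[str, float]) -> float:
    """P(at least one listed route succeeds), treating routes as independent.
    Anything you never thought to list contributes exactly zero to the total."""
    p_none = 1.0
    for p in routes.values():
        p_none *= 1.0 - p
    return 1.0 - p_none

# Framing 1: brainstorm from known disaster types.
# Note that the "heat the atmosphere as an industrial by-product" route has
# no slot here at all, so it is silently treated as impossible.
disaster_types = {
    "pandemic": 0.01,
    "nuclear war": 0.01,
    "chemical weapons": 0.01,
}

# Framing 2: brainstorm from conditions humans depend on, asking for each one
# whether an unaligned optimizer could take it away.
conditions_humans_need = {
    "habitable temperatures": 0.01,
    "farmland for food": 0.01,
    "drinkable water": 0.01,
    "breathable atmosphere": 0.01,
}

print(p_any(disaster_types))          # counts only the routes you thought of
print(p_any(conditions_humans_need))  # a different template, a different total
```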
Can a runaway AI take all the farmland away? Well, how could a particular piece of land be taken away? Buy it. Build a datacenter on it. Now that land can’t be used for farming because there’s a big hot datacenter on it. Do people buy up land and build datacenters on it today? Sure, and sometimes to the annoyance of the neighbors. What do you need to do that? Money. Can AI agents earn money from trade with humans, or with other AI agents? They sure can. Could AI agents be sufficiently economically dominant that they can literally outbid humanity for ownership of all the land? Hmm… that one is not as easy to dismiss as “could it cause a nuclear war big enough to kill everyone.”
So the advice here could be summed up as something like: When you notice that you’re brainstorming an exhaustive list of cases, first check that they’re cases of the right kind, and that the set you’re drawing them from actually covers the territory.