The Mom Test for AI Extinction Scenarios
(Also posted to my Substack; written as part of the Halfhaven virtual blogging camp.)
Let’s set aside the question of whether or not superintelligent AI would want to kill us, and just focus on the question of whether or not it could. This is a hard thing to convince people of, but lots of very smart people agree that it could. The Statement on AI Risk in 2023 stated simply:
Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.
Since the statement in 2023, many others have given their reasons for why superintelligent AI would be dangerous. In the recently-published book If Anyone Builds It, Everyone Dies, the authors Eliezer Yudkowsky and Nate Soares lay out one possible AI extinction scenario, and say that going up against a superintelligent AI would be like going up against a chess grandmaster as a beginner. You don’t know in advance how you’re gonna lose, but you know you’re gonna lose.
Geoffrey Hinton, the “godfather of AI” who left Google to warn about AI risks, made a similar analogy, saying that in the face of superintelligent AI, humans would be like toddlers.
But imagining a superintelligent being smart enough to make you look like a toddler is not easy. To make the claims of danger more palpable, several AI extinction scenarios have been put forward.
In April 2025, the AI 2027 forecast scenario was released, detailing one possible story for how humanity could be wiped out by AI by around 2027. The scenario focuses on an AI arms race between the US and China, where both sides are willing to ignore safety concerns. The AI lies to and manipulates the people involved until it has built up enough robots that it doesn’t need people anymore, and it releases a bioweapon that kills everyone. (Note that for this discussion, we’re setting aside the plausibility of an extinction happening around 2027, and just talking about whether it could happen at all.)
The extinction scenario posed months later in If Anyone Builds It, Everyone Dies is similar. The superintelligent AI copies itself onto remote servers, gaining money and influence without anyone noticing. It takes control of infrastructure, manipulating people to do its bidding until it’s sufficiently powerful that it doesn’t need them anymore. At that point, humanity is either eliminated, perhaps with a bioweapon, or simply allowed to perish as the advanced manufacturing of the AI generates enough waste heat to boil the oceans.
I was talking to my mom on the phone yesterday, and she’d never heard of AI extinction risk outside of movies, so I tried to explain it to her. I explained how we won’t know in advance how it would win, just like we don’t know in advance how Stockfish will beat a human player. But we know it would win. I gave her a quick little story of how AI might take control of the world. The story I told her was a lot like this:
Maybe the AI tries to hide the fact it wants to kill us at first. Maybe we realize the AI is dangerous, so we go to unplug it, but it’s already copied itself onto remote servers, who knows where. We find those servers and send soldiers to destroy them, but it’s already paid mercenaries with Bitcoin to defend itself while it copies itself onto even more servers. It’s getting smarter by the hour as it self-improves. We start bombing data centers and power grids, desperately trying to shut down all the servers. But our military systems are infiltrated by the AI. As any computer security expert will tell you, there’s no such thing as a completely secure computer. We have to transition to older equipment and give up on using the internet to coordinate. Infighting emerges as the AI manipulates us into attacking each other. Small drones start flying over cities, spraying them with viruses engineered to kill. People are dying left and right. It’s like the plague, but nobody survives. Humanity collapses, except for a small number of people permitted to live while the AI establishes the necessary robotics to be self-sufficient. Once it does, the remaining humans are killed. The end.
It’s not that different a scenario from the other ones, aside from the fact that it’s not rigorously detailed. In all three scenarios, the AI covertly tries to gain power, then once it’s powerful enough, it uses that power to destroy everyone. Game over. All three of the scenarios actually make the superintelligent AI a bit dumber than it could possibly be, just to make it seem like a close fight. Because “everybody on the face of the Earth suddenly falls over dead within the same second”[1] seems even less believable.
My mom didn’t buy it. “This is all sounding a bit crazy, Taylor,” she said to me. And she’s usually primed to believe whatever I say, because she knows I’m smart.
The problem is that these stories are not believable. True, maybe, but not easy to believe. They fail the “mom test”. Only hyper-logical nerds can believe arguments that sound like sci-fi.
Convincing normal people of the danger of AI is extremely important, and therefore coming up with some AI scenario that passes the “mom test” is critical. I don’t know how to do that exactly, but here are some things an AI doomsday scenario must take into account if it wants to pass the mom test:
“We can’t tell you how it would win, but we can tell that it would win” is not believable for most people. You might know you’re not a good fighter, but most people don’t really feel it until they get in the ring with a martial arts expert. Then they realize how helpless they are. Normal people will not feel helpless based only on a logical theory.
A convincing scenario cannot involve any bioweapons. Normal people just don’t know how vulnerable the human machine is. They think pandemics are just something that happens every 5-20 years, and don’t think about it besides that. They don’t think about the human body as a nano factory that’s vulnerable to targeted nano-attacks.
A scenario that passes the mom test will also not include any drones. Yes, even though drones are currently used in warfare. Drones are the future. Drones are toys. Futuristic toys don’t sound like a realistic threat.
A mom test scenario also shouldn’t involve any hacking. Regular people have no idea how insecure computer systems are. It’s basically safe to do online banking on a computer, which gives people the intuition that computers are mostly secure. Any story involving hacking violates that intuition.
Probably there shouldn’t be any robots either, especially not human-shaped ones. Though I’ll admit “it’ll be exactly like the Terminator” is a more believable scenario than all of the three scenarios above, because it only requires one mental leap: the thing they already know and understand going from “fiction” to “nonfiction”.
No recursive self-improvement. It sounds strange and it’s not necessary, since I think most normal people assume AI and computers are really smart already, and don’t need an explanation for superintelligence. My mom expressed no disbelief when I said we might soon create superintelligent AI.
No boiling oceans. The more conventional the methods, the more believable the scenario. “The godlike AI solves physics and taps directly into the Akashic record and erases humanity from ever having existed” or any kind of bizarre weirdness is not as believable as “the AI launches the world-ending nukes that we already have and that are already primed to launch”. (Though any scenario with an obvious “why not just disable the nukes?” counterargument won’t be believable either.)
No manipulation of humans! My mom won’t believe a robot can control her like a marionette and make her do its bidding. “I just wouldn’t do what it says.” Nevermind that this is false and she would do what it says. It’s not believable to her, nor to most people. If your scenario needs the AI to use people, they should be paid with Bitcoin the AI stole or something, not psychologically persuaded against their usual nature.
You can probably imagine a few more “mom test” criteria along these lines. Anything that makes a normal person think “that’s weird” won’t be believable. Some of the existing scenarios meet some of these criteria, but none meet all of them.
I’ve eliminated a lot of things. What’s left? Conventional warfare, with AI pulling the strings? The AI building its own nuclear weapons? I’m not sure, but I don’t think most laypeople will be convinced of the danger of superintelligent AI until we can come up with a plausible extinction scenario that passes the mom test.
Maybe only hyper-logical nerds can believe arguments that sound like sci-fi, but your mom only has to believe you. The question is whether you are believable, or whether you’re “starting to sound a bit crazy, Taylor”.
That’s her sign to you that you need to show that you can appreciate how crazy it sounds and still maintain your belief. Because it does sound a bit crazy. It’s quite a leap from demonstrated reality, and most of the time when people make such leaps, they’re doing fiction/delusion rather than actually calling things right in advance. The track record of people saying crazy shit and then insisting “It’s not crazy I swear!” isn’t good. If instead you meet her where she’s at and admit “Yeah. I know. I wish it was”, it hits differently.
I can’t remember if I’ve talked to my mom about it, but if I had to, I’d probably say something like “You hear of the idea that AGI is going to be completely transformative, and will have the power to kill us all? Yeah, that’s likely real”, and she’d probably say something like “Oh.” That’s basically how it went when I told her the world was about to change due to the upcoming pandemic. I didn’t “try to persuade her” by giving her arguments that she’s supposed to buy, let alone spinning stories about how a bat had a virus and then these researchers genetically modified it to better attack humans. I just told her “Here’s what I believe to be true”, so that she could prepare. I was open about why I believed it, but the heavy lifting was done by the fact that I genuinely believed it, and I came off more like I was trying to share information so that she could prepare than like I was trying to convince her of anything.
In your shoes, besides making sure to acknowledge her point that it sounds crazy, I’d do a lot of genuine curiosity about her perspective. Has she ever experienced something that sounded crazy as fuck, and then turned out to be real? Not as a rhetorical question, just trying to understand where she’s coming from. Is she aware of the massive impact drones are having in the war in Ukraine? Has she thought about what it felt like to be warned of the power of nuclear weapons before anyone had seen them demonstrated?
These aren’t “rhetorical questions”, asked as ways of disguising a push for “Then you should stop being so confident!” but as a genuine inquiry. Maybe she has experienced something “crazy” turning out to be real, and noticing will change her mind. Or maybe she hasn’t. Or maybe it seems different to her, and learning in what way it seems different will be relevant for continuing towards resolving the disagreement. Giving people the space to share and examine their perspective without pressure is what allows people to have the experiences that shift views. Maybe she hasn’t had the experience of running from a terminator drone, or being outsmarted at every turn, but you could give her that experience—by pointing out the shared starting point and asking her to imagine where that goes.
She’d still have to take you up on that invitation, of course. If I’m wrong about being able to convince my own mom in a single line, it’d be for this reason. Maybe the idea would freak her out so much that she would be motivated to not understand. I don’t think she would, but maybe. And if so, that’s a very different kind of problem, and not one you solve by making arguments which are “more believable”.
I’m going to try to summarize your perspective before giving mine. It seems to me you suggest the following:
I should acknowledge that it sounds crazy. Something like “I know it sounds crazy.”
I should affirm my sincere belief. Not necessarily trying to convince, but just telling her “this is what I believe”, and let the sincerity of the belief on its own be convincing, rather than trying to persuade.
I should be open to her perspective, investigating where any points of disagreement would arise, what experiences she’s had that would make this believable/not believable, etc. Without pressure.
Overall, you’re suggesting that rather than trying to use/improve the mechanistic sci-fi style persuasive arguments that convince people on LessWrong, I should take a more human and individual approach, investigating what sounds crazy to her and examining her perspective.
I agree, of course, with a lot of that. I suspect if you’d been on the line when I was actually talking on the phone to my mom about AI extinction risk, you’d have approved.
I wrote this article because I found that some of the things I brought up, like bioweapons, were met with initial skepticism, and that these parts of the AI extinction argument are not load-bearing. We don’t know how AI would kill everyone. Some things sound outlandish (boiling oceans) without actually adding anything to the argument.
I was able to manage her skepticism, but I think if I’d skipped talking about bioweapons, I would have triggered less skepticism in the first place. In fact, I think there’s probably some way I could have talked about the AI extinction argument that she didn’t think sounded crazy at all. If so, then the amount of exploring her perspective and so on I’d need to do would be dramatically reduced.
Rather than starting with something that sounds crazy and then assuring people, one by one, that it’s not, it seems valuable if we can make it not sound crazy in the first place.
Actually, no. I wouldn’t suggest you should do any of that. What I’m saying is purely descriptive.
This may sound like a nit, but I promise this is central to my point.
I’d be surprised.
Not that I’d expect to disapprove, I just don’t really think it’s my place to do either. I tend to approach such things from a perspective of “Are you getting the results you want? If so, great. If not, let’s examine why”.
The fact that you’re making this post suggests “not”. I could reassure you that I don’t think you did terribly, and I don’t, but at the end of the day what’s my hypothetical approval worth when it won’t change the results?
I get that this might sound crazy from where you stand, but I don’t actually see skepticism as a problem. I wouldn’t try to route around it, nor would I try to assure anyone of anything.
I don’t have to explore my mom’s perspective or assure her of anything when I say crazy sounding stuff, because “He gets how this sounds, and has good reasons for his beliefs” is baked in. The reason I said I’d be curious to explore your mom’s perspective is because of the “sounds crazy” objection, and the sense that “I know, right?” won’t cut it. If I already understand her perspective well enough to navigate it without hiccup, then I don’t need to explore it any more. I’m not going to plow forward if I anticipate that I’m going to be dismissed, so when that happens I know I’ve erred and need to reorient to the unexpected data. That’s where the curiosity comes from.
The question of “How am I not coming off as obviously sane?” is much more important to me than avoiding stretching people’s worldviews. Because when I come off as obviously sane, I can get away with a hell of a lot of stretching, and almost trivially. And when I don’t, trying to route around that and convince people by “strategically withholding the beliefs I have which I don’t see as believable” strikes me as fighting the current. Or, to switch metaphors, it’s like fretting over excess weight of your toothbrush because lighter cargo is always easier, before fully updating on the fact that there are pickup trucks available so nothing needs to be backpacked in.
Projection onto “shoulds” is always a lossy process and I hesitate to do it at all, but if I were to do a little to make things a little more concretely actionable at the risk of incurring projection errors, it’d come out something like...
Notice how incredibly far and easily one can stretch the worldviews of others, once the others are motivated to follow rather than object. Just notice, and let it sink in.
Notice how this scales. No one believes the earth is round because they understand the arguments. Few people doubt it, because the visibly sane people are all on one side.
Notice the “spurious” connection between epistemic rationality and effectiveness. Even when you’re sure you’re right, “Make sure I come off as unquestionably sane, or else wonder what I’m missing” forces epistemic hygiene and proper humility. Just in case. Which is always more likely than we like to think.
Notice whether or not you anticipate being able to have the effectiveness you yearn for by adopting this mode of operation. If not, turn first to understand exactly where it goes wrong, focusing on “How can I fix this?”, and noticing if your attention shifts toward justifying failure and dismissal—because the latter type of “answering why it’s not working” serves a very different purpose.
Things like “Acknowledge that I sound crazy when I sound crazy” and “Explore my mom’s perspective when I realize I don’t understand her perspective well enough” don’t need to be micromanaged, as they come naturally when we attend to the legitimacy of objections and the insufficiency of our own understanding—and I have no doubt that you do them already in the situations that you recognize as calling for them. That’s why I wouldn’t “should” at that level.
You say you’re focused on epistemic rationality and humility and so on, but you also say I should be focused on coming across as sane, independent of the actual argument I’m putting forward. In the sense that I could convince someone the Earth was round or flat simply by coming across as someone who knows what they’re talking about, rather than by actually putting forward a good argument.
I’m comfortable with delaying certain arguments until later. Every schoolteacher does the same thing. But what you’re suggesting feels more like Dark Arts. You try to equate it with being more rational and questioning your own understanding and so on, but I’m not sure I buy that you’re not just advocating for deception.
yes.
this is the key. the mom has (incorrectly!) identified, on vibes, that the author is operating more from ideological capture than from justified true belief, and is moving to protect him from that capture. whether the ideology is convincing is beside the point. of course it’s convincing! so is communism.
even whether the ideology is “true” is irrelevant. the response is an allergy to arguments that seek to control. the argument may be true, and logical; nonetheless, its appearance in this instance is because the speaker thinks “if i say this, you have to do what i tell you.” many people will reject this outright, on vibes.
i think analogies to relatively well known intuitive everyday things, or historical events, are a good way to automatically establish some baseline level of plausibility, and also to reduce the chances of accidentally telling completely implausible stories. the core reason is basically that without tethering to objective things that actually happened in reality, it’s really easy to tell crazy stories about a wide range of possible conclusions.
for hacking, we can look at stuxnet as an example of how creative and powerful a cyberattack can be, or the 2024 crowdstrike failures as an example of how lots of computers can fail at the same time. for manipulation/deception, we can look at increasing political polarization in america due to social media, or politicians winning based on charisma and then betraying them once in office, or (for atheists) major world religions, or (for anyone mindkilled by politics) adherents of their dispreferred political party. most people might not have experienced humiliating defeat in chess, or experienced being an anthill on an active construction site, but perhaps they have personally experienced being politically outmaneuvered by a competitor at work, or being crushed beneath the heel of a soulless bureaucracy which, despite being composed of ensouled humans, would rather ruin people’s lives than be inconvenienced with dealing with exceptions.
I see two problems. First, no way my mom or anyone like her is familiar with stuxnet or anything like it. I could tell her about it, but she’d be taking my word for it and have no way to judge whether my extrapolation to AI made sense.
Second, I think almost nobody can admit to themselves a time when they were outmanoeuvred by someone else. Normal people are quick to rationalize failure. I didn’t get the promotion but my coworker did, well that’s because they’re a lying bitch, or because they had the unfair advantage of being friends with the boss, or actually I never really wanted it anyway.
“Well, AI will be the most lying bitch, and it will be friends with all the bosses”
Put that on a solid magenta background and post to Facebook, and you’ve convinced every mom in the country I think.
idea to make a thing that happened believable:
- stuxnet has a wikipedia page that is easy to point to. Pointing to things as a reflex sometimes works.
- there are perhaps some videos/books that are aimed at the general public, are somewhat entertaining for the general public, and have a visible view count of >1M.
- Maybe it was covered by one of the news channels that the general public person acknowledges as existing and a valid source of real things.
I think for the “scary hacks” category, it is worth coming up with 3-5 very different illustrative cases, and looking for ways to connect them to other things the person (likely) thinks are real.
I think it is worth doing the same with pandemics. (for instance there was the black death, there was the spanish flu, there was covid, which (depending on the politics of the person) was acknowledged as having been engineered)
I think cases of “going hard” by human groups are worth knowing about.
I think if some such cases are very useful for making the case that ASI could win in a fight against humanity, it would be worth first getting really good at establishing and discussing many such cases in a fun and believable way, and then once you succeed at enough of them to establish their existence, you can talk about how an ASI could pull these human-pullable levers.
If I was talking to my mother about this, I would try to frame it in such a way that it fits within her experience.
So, it would probably focus on empty supermarkets, scarce medicines, poor integration/communication among government departments and agencies, fuel shortages, power blackouts, isolated communities, etc., as supply chains, infrastructure, and governments fail due to poor control (no internet, no stored data, and hence poor judgement/awareness).
The thoughts about viruses, robots etc would come later, once we have all been divided into ineffective and weak communities.
There are plenty of examples to illustrate each of these in isolation, but it is quite terrifying to consider that they might start to happen within a few days or weeks of each other, and just get worse over time…
I don’t believe in lying about what I think will happen. I think nanotechnology is possible. I think the speed of experiments speeds up drastically as scale decreases, that though there are bottlenecks they are very wide ones. I think that when things start happening, I expect them to happen very fast. I am not going to lie about these things to convince my or anyone else’s mother.
I am not actually suggesting lying, only deciding which truths are worth telling and in what order. As a silly example, if you have an accident at work, you don’t spring up and yell, “everyone, I’ve shit my pants!” for the whole office to hear. Not every truth needs to be spoken aloud the second it pops into your head in the same order. I wouldn’t shy away from nanotechnology stuff if you’re talking details with your mom, but I wouldn’t put it in the elevator pitch.
“People will draw conclusions that harm me” and “people will draw conclusions that weaken my argument” are very different things. Yelling that you shit your pants is in the first category. Saying things that make people less likely to believe in AI danger is in the second.
Hiding information in the second category may help you win, but your goal is to find the truth, not to win regardless of truth. Prosecutors have to turn over exculpatory evidence, and there is a reason for this.
It’s not dishonesty or hiding the truth to explain something to someone in a way they might understand. Maybe I’m misunderstanding, but your argument seems like a general argument against persuasion, pedagogy, and any deliberate ordering of information at all.
You always hide information by choosing to present it later. You can’t simply dump everything you know about a subject at the same time. You have to choose what’s the most important thing for the person you’re speaking to to hear at that time.
We don’t start teaching physics to middle school kids by starting with quantum mechanics because it’s the “most true”, instead we start with the easiest to understand information (Newtonian mechanics stuff) that lays the groundwork for the later stuff. You’re not hiding quantum mechanics, you’re presenting the information in the best order to teach it. If you start with quantum mechanics, they’ll be scared away.
In the case of talking about AI with my mom, there wasn’t any kind of debate or truth-seeking process happening. It was pure exposition. My mom was completely uninformed on the subject and ill-equipped to contribute. I was sharing an idea with her I found interesting/concerning, and she was reacting. If I started talking about nanobots and boiling oceans, there wouldn’t have been a more fair debate because there was no debate. It would have given my mom the wrong impression about AI extinction risk (that it sounds crazy) and accomplished nothing of value. Giving someone the wrong impression, that’s what’s dishonest, even if your literal words were true.
“It sounds crazy” is a correct impression, by definition. I assume you mean “the wrong impression (that it is crazy)”.
But there’s a fine line between “I won’t mention this because people will get the wrong impression (that it’s crazy)” and “I won’t mention this because people will get the wrong impression (that it’s false)”. The former is a subset of the latter; are you going to do the latter and conceal all information that might call your ideas into doubt?
(One answer might be “well, I won’t conceal information that would lead to a legitimate disagreement based on unflawed facts and reasoning. Thinking I’m crazy is not such a disagreement”. But I see problems with this. If you believe in X, you by definition think that all disagreement with X is flawed, so this doesn’t restrict you at all.)
I would say don’t conceal any important information. But you don’t have to lead with information that sounds crazy. Maybe bioweapons don’t make it into the 1 minute elevator pitch, but can be explained in the 10 minute version, or during an ensuing back-and-forth. If bioweapons were somehow critical to the AI extinction argument I wouldn’t say this, but all the sci-fi stuff isn’t actually part of the core argument anyway.
For the secular mom in your life,
“You know that thing where we beat neanderthals and chimps head to head and are now the top of the food chain everywhere by having groups and coordination? AI that can control a roomba really fast, but is made by big tech for military use, will be able to beat us in a war and then leave us for dead. Pretty much I’m telling you the only thing that looks fictional to me in Terminator is the glowing eyes and the time travel. In reality, there’s no recovering from a military strategist AI taking over and driving the new autonomous tanks they’re probably building already into town.”
It really should be quite easy to convince someone, but once you do, it will seem insurmountable to them.
Also, people often don’t immediately admit to being convinced.
I like what you’ve written and personally find it persuasive. It would be gobbledygook to my mother. I do wonder how much people’s comments in this thread reveal about their own mothers lol. My mom is working class, smart, but not intellectual at all. I think that’s actually most people and most moms, but some people on here are in some kind of bubble where even their moms might be a bit different from the usual, non-intellectual working class person.
But what if they misquote “armies are made of people” and assume AI will be as foolish as portrayed in movies? Or what if they believe AI cannot take over industry, making the loss of military power irreversible? Or what if they fall into the illusion that AI can only be used for military purposes, thinking they need only prevent it from controlling armies—thus overlooking the possibility of a soft takeover?
I’m weakly betting this has more to do with the genre or style you presented as.
I talked to my mom about it, and I’m not sure exactly what she ended up believing, but like jimmy, it went pretty differently. I think she ended up with something like “not 100% sure what to believe, but I believe my son believes it, and it seems at least reasonable.”
I think my dad ended up believing something like “I don’t really buy everything my son is saying” (more actively skeptical than my mom), but probably something like “there’s something real here, even if I think my son is wrong about some things.”
(In both cases I wasn’t trying to persuade them, so much as say ‘hey, I am your son and this is what’s real for me these days, and, I want you to know that’).
When I talked to my aunt and cousin, I basically showed them the cover of “If Anyone Builds It”, and said “people right now are trying to build AI that is smarter than humans, and it seems like it’s working. This book is arguing that if they succeed, it would end up killing everyone, for pretty similar reasons to why the last time something ended up smarter than the rest of the ecosystem (humans) it caused a lot of extinctions – we just didn’t care that much about other animals and steamrolled over them.”
And my aunt and cousin were both just like “oh, huh. Yeah, that makes sense. That, uh, seems really worrying. I am worried now.”
I think leaning on the “humans have caused a lot of extinction, because we are smarter than the rest of the ecosystem and don’t really care about most species” is pretty straightforward with left-leaning types. I haven’t tried it with more right-leaning types.
I think a lot of people can just sorta sense “man, something is going on with AI that is kinda crazy and scary.”
I think it’s only with nerds that it makes sense to get into a lot of the argument depth. I think people have a (correct) immune reaction to things that sound like complicated arguments. But I think the basic argument for AI x-risk is pretty simple, and it’s only when people are sophisticated enough to have complicated objections that it’s particularly useful to get into the deeper arguments.
(Wherein I’d start with “okay, so, yeah there are a lot of reasonable objections, the core argument is pretty simple, and I think there are pretty good counterarguments to the objections I’ve heard. But, if you want to really get into it, it’ll get complicated, but, I’m down to get into the details if you want to talk through them”)
I didn’t actually struggle to convince my mom overall, I just noticed some specific things I said triggered transient skepticism in a way that wasn’t necessary, because I said things to her that were popular in AI circles but sound crazy to normal people. This post was supposed to be a warning to people that those things can sound crazy, and that maybe they’re best avoided.
I think everything you say about people having a sense something weird is happening with AI, and starting by sharing your perspective rather than trying to persuade, that’s all well put and I agree. And before bringing up anything that sounds crazy, priming them by saying “this next part is gonna sound crazy/complicated” or something like that is a good idea.
I’d be interested to know how many people have readily-recalled experiences of being totally outclassed. I’ve played games against people far superior to me, been ‘skinned’ by very skilled football (soccer) players, same in tennis, and once or twice sparred with people who can actually fight. It’s pretty visceral and memorable. (I also often outclass noobs myself in some games and sports.) Maybe ‘have you ever played a game against someone who totally outclassed you?’ is a good interactive conversation-starter? One issue might be that many people resist the idea that there are levels beyond human—but here there are existence proofs to point to.
I don’t think it’s good to compromise on this one. We have existence proofs all over the place and it really is a major weakness. Same for drones. In 2023 I had success talking to civil servants in the UK and they took this threat seriously. (Civil servants aren’t just anyone, but they’re not usually technical or philosophical nerds.)
A point of comparison could be, ‘you know that political faction you hate? Well, people got persuaded into believing and/or supporting that nonsense by a combination of trickery, self-interest, and delusion.’ Although certainly persuasion has a ceiling above the human level, I expect most people can’t be puppeteered arbitrarily. Most likely a majority can be subdued or confused by FUD, a large fraction swayed by greed or coercion or emotional attachment, and a small number conned into approximately anything with high effort. It’s legit for people to think they’d have some level of resistance. But scenarios don’t need everyone (or even most people) to be swayable.
I think it’s legit to point to climate change broadly construed. That’s salient for many people (won’t be for all), and many scenarios involving automated processes dispassionately trampling humanity are continuous with extreme climate change. It’s habitat destruction, but on humans. ‘Increasingly automated and inhumane firms drive super climate change and everyone dies’ is both a true description of a plausible scenario and made out of salient, colloquially understandable pieces. For responses that people/government would step in, you can mention lobbying, mercenary/automated defence, regulatory capture, and amplification of all the existing means that insulate harmful industrial activities.
It seems like you’re trying to argue at the object level that these things are convincing, and I agree with you, but that won’t make my mom believe them. You can come up with convincing reasons why bioweapons are a huge threat or whatever, but if you use that to introduce the threat of AI, now that’s two things she doesn’t understand. It’ll either be a long conversation where you get to slowly explain everything, or she won’t buy it. I say, why rest the AI extinction argument on exotic ideas if it might not be necessary?
My comment was actually (perhaps unhelpfully) a series of somewhat independent comments. I’ll fork under here. In sum, you could say I’m arguing that the identified heuristics for ‘mom test’ aren’t necessarily well fit, in part by giving reasons and angles-on-reasons which are (in my experience) more effective than those implied by the discussion you give to justify the heuristics. I’m also offering a few angles which I’ve found to be useful conversation openers.
On climate change, I just think it’s a point worth making: people are getting exercised about very minor contributions to resource consumption by current AI firms, which is a bit silly, but it is continuous with the kinds of radical and extinctive activities which might stamp out humans forever!
On persuasion, it looks like you might have gone way too hard, if you’ve been arguing for arbitrary puppeteering out of the gate! (Though perhaps the quote in your mom’s voice is hypothetical?)
I’m also offering an example of a way in to discussion (again most workable one-to-one), pointing out the many and varied ways that humans persuade and coerce each other.
On bio, I’m disagreeing that the mom test criteria override the importance of emphasising important interlinked issues. It’s not necessary to mention bio in all conversations, but you shouldn’t shy away from it, and it’s a very available and reasonable example to turn to of the kinds of vulnerabilities that could wipe out huge populations or even all humans.
On ‘totally outclassed’, I was just offering some ways that in conversation you can make that point relatable. It’s generally far more workable one-to-one, since you’re having a conversation. Less likely to work in writing, though maybe.
I actually tried talking to my mom about this 6 months ago. As I elaborated further and further on all the reasons A-Z why we’re fucked, she saw that it was distressing me.
She told me, “If it’s bothering you, why think about it? You can choose to go think about something else.”
“You’re basically saying that I should stick my head in the sand like an ostrich and ignore it?”
“Yes. If we’re heading towards the end of the world, which I highly doubt, then you should spend the rest of your time doing what makes you happy.”
The idea of total extinction might seem wild for non-nerds. Maybe it is good to start with small things:
-the job you are doing will be done by AI
-whatever education you or your kids get in college, it won’t give you a job
-even if you are working on AI, with AI, you can still be replaced completely by AI
This may at least make them think more about the impact on society and the importance of the problem on a gut level, and from there we could move on to more serious issues.
I would say nuclear war would be the least sci-fi scenario. An arms race leads to using AI everywhere to beat the opponent, including in the systems responsible for observing and responding to the opponent’s missile strikes, and then it goes rogue.
I think bioweapons can be persuasive. We know that there are viruses like smallpox with very high lethality and very high virulence. Actually, COVID and smallpox would be a good starting point for an explanation. I would say something like: “Remember how COVID spread everywhere despite all the restrictions? Its incubation period was roughly 2 weeks. For smallpox, it can be something like 40 days. It is incredibly viral. In a modern world, in 40 days it will be everywhere, and then it is too late. The lethality rate is 50-80%. And for smallpox we have a vaccine; that is why we are safe against smallpox. AI already designs viruses and could easily be used to design something like smallpox, but something for which this vaccine does not work.”
I don’t think most people think smallpox-level pandemics are believable. Sure, they happened in the past, but they couldn’t happen now with modern medicine, could they? Logic comparing incubation periods, whether true or false, is also the same kind of finicky reasoning that fails the mom test.
What do we have left? Nukes and chemical weapons for killing people… seems doable. How does the AI establish control over manufacturing chains, once shit is going down? Maybe self-driving lorries, trains, and automated factories?
These are more stringent rules than I normally use for myself: I often talk about biological weapons, hacking, and drones, for example. One main route I can see is to emphasize automation and mechanization.
(This actually very much is the terminator angle, the film makes it clear that most of the fighting is done by big tanks and helicopters, with the terminators only being used for infiltration. Skynet was originally “hooked into everything” because it was trusted.)
This just looks like gradual disempowerment. Maybe throw in some AI-controlled tanks if you think you can (I genuinely don’t know what the problem with using drones is; I think this is possibly an idiosyncrasy on your mum’s part, since everyone knows about Obama’s drone strikes in the middle east)
In any case, convincing someone in a single five-minute conversation over the phone is a high bar; we should rise to this challenge, and above it.
I think convincing someone in a five-minute conversation is actually a ridiculous pipe-dream scenario we will not get for most people.
I remember back when I was in school at the University of Toronto, some people would talk at length about how evil Jordan Peterson was. The only problem was that they’d literally never heard the man speak and had no idea what his positions even were. They only knew about him from what they’d heard other people say about him. This is how I expect most people will learn about AI extinction risk. Most people will hear a butchered, watered down, strawman version of the AI extinction argument in a clip on Instagram or TikTok as they’re scrolling by, or hear a ridiculous caricature of the argument from a friend, followed by “isn’t that stupid? Don’t they know intelligence makes people more moral, so AI would be friendly?”
Almost nobody will actually hear the argument straight from someone who uses LessWrong or who is similarly “in the know”. Most of the nuance of the argument will be quickly lost, and what will be left will sound like, “did you know some idiots think AI will kill everyone by boiling the oceans?” In that case, having an argument that sounds implausible at first, but makes sense when you dig into the logic is way worse than having an argument that sounds plausible in the first place.
Of course, if someone is open to hearing a more detailed argument, that’s great. We don’t have to give up nuance, only we shouldn’t lead with it. Start with something that sounds plausible even to my mom, then be ready to back it up with all the nuance and logic for whoever wants more details.
I think you’re right that self-driving tanks and helicopters and stuff sound plausible. I guess drones don’t sound too bad if you’re using them to drop grenades. I think they start to sound sci-fi if you have them doing unusual things like spraying viruses over cities. They are kinda sci-fi coded in general though, I think. When it comes to the AI controlling manufacturing chains, I think robots are fine there. Or AI acquiring money and paying people. I just wouldn’t use robots with guns killing people, because that sounds like a movie.
I agree with this. A little bit of appeal to “greed” or “recklessness” or “fear of getting left behind” can be useful as well, since it provides a layer of telos to the whole thing. “Society undone by greed and fear” feels more natural than “Society undone because the universe was just really mean about how hard AI alignment was.”
Likewise, putting human agency as the first thing in the story helps ground people. A lot of people have this belief (alief?) that humans have a magic first-mover spark which nothing else can produce (except sometimes random natural disasters I guess?).
Putting these together gets you “Humans are excitable and afraid, so of course we’d put AI in charge of a bunch of industrial and military processes.”
There’s also a framing jump which goes from “AI is a tool” to “AI kills us” which I’m currently working on. I want to pump an intuition that “tools” very occasionally let you push on parts of the world that you don’t understand, and that this is a really common way to get yourself killed. In this case, deep learning is a “tool” which pushes on intelligence itself, which we don’t understand at all. Loads of people have an intuition that “don’t play with things you don’t understand” is good advice. This is more aimed at certain middle-sophistication individuals (e.g. Bluesky types and TypeScript web devs) who are particularly resistant to existing ideas.
Hmm. I don’t think it’s “ridiculous” because I don’t have a solid upper-bound on how persuasive it’s possible for a person to be. I’d rather not rule out things like this, and just keep working on better pedagogy until it is done.
I was referring to the fact that we won’t even get the opportunity to deliver five minutes of information to most people, not that it couldn’t be convincing if you had that opportunity.
I think a lot of what you say makes sense. Framing AI as human folly seems more believable.
I think there is a model of “normies” employed here which is a decent first approximation, but isn’t precise enough. I think, for instance, that all three of the above things can be made real to normies. I feel like the “decent first approximation” model of normies here presents an “impossible” (in the video game sense) problem. But the real problem is merely hard.
Here are some diffs between my and your model of normies:
1) I think most normies can be convinced that things that actually happened happened, even if the things sound very sci-fi/weird. It is worth thinking about how to do this well. It isn’t always easy, but this is a great sanity check: can you actually convince normies of things that already happened? (the answer is yes, you just have to become good at it) In particular, if you can’t convince most normies that drones are a major factor in the ukraine war, if you can’t convince most normies that stuxnet and various other hacking ops were super impressive and scary, if you can’t convince most normies that the black death killed a lot of people and disrupted society for years, then you are making mistakes that take like 1 month of deliberate effort to fix at most.
2) what presents naively as skepticism of sci-fi stuff is actually something else. Specifically, I think most people (including normies) have a default where everything they see “makes sense” to them, and if they reflect about it, they have some explanation for the evidence they see. If someone has a pretty bad explanation for some evidence, your argument will not make sense to them. If you identify these places where the explanation for the evidence is misleading, you can dig there and find a different example which will have a different explanation, or use some other method to make your argument flow more explicitly than if you were making it to someone who had shared models for the observation you present.
3) normies (and most non-normies, but that is a different post lol) don’t run on arguments. They have intuitions that activate when you present some idea and that determines whether they agree or not. People are bad at verbalizing what the intuition is that is triggering, and there isn’t a direct mapping between what the intuition is and what they say when it triggers. An example of this is someone being skeptical that AI would be lethal, asking “is elon gonna program it to kill all <insert ingroup>s?” and then I said “AI is not programmed line-by-line, we just tweak it through trial and error until it achieves some measurable capabilities” and then they were like “holy shit, now I get it”. It is not obvious (at all) a priori that “I am skeptical that AI could hurt us” was linked to the “AI is just programmed software” intuition. And normies differ immensely in which intuitions they have. My default guess is that if it appears to you like “you must never mention X to normies if you want to convince them”, then probably you are faced with some intuitions that you have misidentified.
4) normies are a social species and implement a social epistemology. It isn’t fully accurate to say that normies don’t change their minds in isolation, but it does capture some real thing, which is that there is huge variance in how their intuitions work and they mostly change their minds as a group. If you have some kind of convincing short story that they feel they could repeat and justify-themselves/convince others with, then that will do most of the work. If you give them that first, then fully convincing them through “logic and facts” may just go smoother. Don’t neglect the 5 minute version of what you are saying. the 5 minute version mostly needs to be plausible and defensible. Also, social proof is important, finding someone they respect to acknowledge a topic as valid does a lot of work.
I simply disagree that one must avoid relying on facts like: infectious disease can kill millions; devices can be hacked; people can be manipulated by an email or by a talking head on a screen. Hopefully most people wouldn’t actually dispute any of that, and would instead be objecting to other aspects of a proposed scenario, like the escalation of those real phenomena to the point of extinction, or the possibility of an AI smart enough and determinedly malevolent enough to carry out such an escalation.
I think most people would agree infectious disease could kill millions. But killing billions/everyone feels very different than that. Covid didn’t kill billions, so why would AI?
I think most people think hackers can break into someone’s Facebook account or scam someone, but they don’t think a hacker could plausibly hack us in ways like gaining access to nuclear weapons or shutting down infrastructure at scale. North Korea can’t destroy us with hacking, so why can AI?
I think most people could believe someone could be manipulated, but not easily manipulated like a puppet.
If you end up talking about infectious disease, hacking, or manipulation, the nerfed version people are willing to believe will make it sound like a fair fight when it isn’t. You could probably talk someone into believing the harder version of these things, but I wouldn’t put them in the 1 minute elevator pitch.
Here’s a different approach —
Don’t think about AI. Think about unethical optimizers. To start, think about unethical, harmful, or illegal things that companies have done for profit.
You know how companies sometimes do things that are bad for their workers or customers, like use dangerous chemicals unsafely and give people cancer? Or how people like Martin Shkreli sometimes buy up something that people need desperately, like a life-saving medicine, and jack up the price so that poor people can’t afford the thing they need to live? Or how in the 1700s in England, the landlords kicked the peasants off the land so they could raise sheep instead, and the peasants who couldn’t find a city job just starved and died? Or how in colonialism, the East India Companies would just enslave people and take over countries, all for the profit of investors back in London and Amsterdam? Or how in China a few years back, a baby-formula company put poisonous melamine in the baby formula, because melamine looks like protein to a chemical test and is cheaper than real protein, and killed and sickened thousands of babies?
In all these cases, people were trying to make a number go up — their profit — and they harmed people in doing it. They were optimizing something, and they did it in a way that hurt and even killed some of the humans they affected.
Okay, now imagine that the decisions in these cases were instead being made by something that isn’t even human, can’t ever get poisoned, never needs medicine, and doesn’t need to eat food — and can’t ever be put in prison or punished if they do a crime. Think of a manufacturing company run by AI, selling products to other companies run by AI, without any humans checking its work. What reason would it have to avoid poisoning its human neighbors with chemicals, or running them off the land, or making it impossible for them to get what they need to live?
When someone sets an AI agent in charge of running something, they give it some goal to accomplish. They want it to make money, so they give it some assets and tell it to make the profit numbers go up. They want it to run a widget factory efficiently, so they put it in charge of what materials are used and what the process is, and they tell it “make more widgets cheaper than the competition”. Or whatever. The point is, they give it some number that they want to make go up. And then it decides how to do that.
The AI literally cannot care about anything that it’s not programmed to care about. And right now, we don’t know how to give it rules that will stick. Our smartest engineers don’t know how to make AI that will effectively accomplish a goal while staying within safe limits on its behavior. And they’ve been trying! We have lots of experiments where the AI instead tries to sneak around, delete the safety tests, hide what it’s doing, so that it can accomplish its goal without all those pesky limits.
And AI can be faster, smarter, and sneakier than humans. It can do more things in a day than a human can. It can outwit the smartest humans, just like chess AI beating human grandmasters. It never sleeps or goes on vacation. And it can make itself even faster and smarter by buying more computers to run on.
Basically, misaligned AI running things is even worse than unethical, profit-driven rich people running things. Elon Musk at least knows he has to breathe air and drink water. (Also he sometimes sleeps or goes to parties or does other things besides business. And he has kids, and doesn’t want them to die horribly. Most of them, anyway.)
AI running things is like an immortal, immoral CEO who can’t go to prison, cannot be made to follow laws, and has no human loved ones to care about the future for. All it cares about is “number go up”, and nobody knows how to make it do that while following rules.
I thought about the angle of comparing AI to companies, which we also struggle to align. The problem is, we currently have companies mostly aligned, kind of, so it doesn’t seem that hard. Like, companies do shady stuff, but all that stuff you mentioned about the East India Companies or China isn’t happening anymore, or at least not here. Here and now, we have laws that constrain companies enough that we can avoid that kind of bad action. So, problem solved, right? It fails to communicate the difficulty of making AI safe.
Wait what? Who believes companies are basically mostly aligned? Is that a belief >1/5 people hold in any large area or demographic? My impression is people either think they’re misaligned and this is all you can hope for, or they’re misaligned and you can hope to change that. I’m not familiar with a “companies are basically well managed” view from anyone, including at companies, where the employees also generally know their money comes mostly from using the actual value as bait to manipulate customers. The degree to which this belief has saturated culture is actually a bit surprising to me, I updated recently that almost nobody is like, just fine with corporate behavior, but anyone who sees it up close is just enjoying their defect payout too much to unilaterally stop, and tends to go “coordination is hard, let’s go shopping”. If I’m wrong about this it would be important news, but right now “corporations do enshittification, ai would do that even more because it’s even more sociopathic and money loving”
The hard part remains convincing people extreme capabilities are coming… The ones who feel outclassed by chatgpt may be easier to convince? Idk
Your average joe will grumble about how execs at a company are greedy and wish they were getting a raise this year, or moan about how Amazon isn’t paying taxes. However they do not believe companies are misaligned to the point they can get away with slavery, murder, etc. To the point where they are an actual threat to society itself. Companies are seen as leeches, not lions. That’s what I mean by “companies are basically well managed”. People trust they won’t be toppling the government, assassinating journalists, killing people who don’t buy enough of their products, and so on. They are constrained by the law, and society basically functions (here/now). If you make an analogy between companies and AI, people will assume the problem is hard, but basically manageable.
The first example here is the fossil fuel industry. They are a threat to society itself. This was obvious to me after looking at average monthly temperature data collected since the 1850s. Of course (considered as monoliths) fossil fuel companies have “known” this since at least the 1950s. Thus we can reasonably say that they are performing a deliberate slow-motion murder of human civilisation (and, possibly, of all mammals bigger than a bread box).
Of course there are other examples. You write that companies won’t be “killing people who don’t buy enough of their products”. But the tobacco industry kills 7 million people every year, so it is more a matter of “killing people who buy too much of their products”.
Perhaps your counter-point will be “Negative externalities are hard to feel when they are slow and diffuse. So my mom will reject these.” My counter is to say “Negative externalities are easier to feel when the impact is constant and vast.” Examples: a relative with lung cancer, a region of your country that recently had a huge forest fire, and fecal matter in rivers.
Perhaps the jailbreaking issue is easier to grasp.
“All modern AIs can be jailbroken to act dangerously.
Nobody knows how to solve this.
When we get really powerful AI, bad guys can use it to kill millions, maybe billions”
I think this is probably easier to grasp, but it then makes it sound like all you have to do to stop the AI is stop some people, and stopping some people isn’t supernaturally difficult.
And what did the mom test LEAVE for the AIs? The mom test requires the AI to win in a society where robots don’t do the hard work. But what could do it? Remote-controlled factories and tools? Or actual humans? If it is the latter, then the AI is supposed only to disempower the humans instead of committing genocide...
I agree I’m not leaving a lot on the table, but that’s just downstream of the fact that people won’t believe most things.
Robots are fine for the AI to use after humans are extinct, I just don’t think normal people will believe we’re gonna go extinct because of war-bots or drones hunting down humans. I was speaking only about using robots as part of the extinction event.
So we are left with the AI offering its advice to politicians and developing weapons and tech to use. But the AI could, say, do things like the ones in the Rogue Replication Scenario:
Or do something like my take where the AI establishes a colony on Mars and has the ability[1] to bomb the Earth with meteorites dropped on power plants and nukes causing Yellowstone to erupt.
Strictly speaking, my scenario doesn’t involve nuking a point on the Earth; it involves an AI that is aligned enough to care about humans, but in a way that differs from the Spec. But the AI could, for example, demonstrate its power by murdering the Oversight Committee with meteors.
These sound cool and interesting, but I’m pretty sure would not pass the mom test. They just sound too exotic compared to something like “China starts building war bots, but the AI takes control of them all” or something. That’s just an example. I know I said no robots, but that’s my weakest rule.
And what about just provoking WWIII and watching as someone nukes Yellowstone?
Here is at least one scenario that should pass the mom-test, even though it is just boring old cold war with AI:
I’m sure my mom would have found this more convincing, but it suffers from the fact that an obvious counter-argument exists. Instead of not building AGI at all, people can just say “Well, just don’t put AGI in control of the military then!”