If there were one thing that I could change in this essay, it would be to clearly outline that the existence of nanotechnology advanced enough to do things like melt GPUs isn’t necessary, even if it is sufficient, for achieving singleton status and taking humanity off the field as a meaningful player.
Whenever I see people fixate on critiquing that particular point, I need to step in and point out that merely existing tools and weapons (is there a distinction?) suffice for a Superintelligence to be able to kill the vast majority of humans and reduce our threat to it to negligible levels. Be it wresting control of nuclear arsenals to initiate MAD or simply extrapolating on gain-of-function research to produce extremely transmissible yet lethal pathogens that can’t be defeated before the majority of humans are infected, such options leave a small minority of humans alive to cower in the wreckage until the biosphere is later dismantled.
That’s orthogonal to the issue of whether such nanotechnology is achievable for a Superintelligent AGI; it merely reduces the inferential distance the message has to cross, as it doesn’t demand familiarity with Drexler.
(Advanced biotechnology already is nanotechnology, but the point is that no stunning capabilities need to be unlocked for an unboxed AI to become immediately lethal.)
Right, alignment advocates really underestimate the degree to which talking about sci-fi sounding tech is a sticking point for people.
The counter-concern is that if humanity can’t talk about things that sound like sci-fi, then we just die. We’re inventing AGI, whose big core characteristic is ‘a technology that enables future technologies’. We need to somehow become able to start actually talking about AGI.
One strategy would be ‘open with the normal-sounding stuff, then introduce increasingly weird stuff only when people are super bought into the normal stuff’. Some problems with this:
A large chunk of current discussion and research happens in public; if it had to happen in private because it isn’t optimized for looking normal, a lot of it wouldn’t happen at all.
More generally: AGI discourse isn’t an obstacle course or a curriculum, such that we can control the order of ideas and strictly segregate the newbies from the old guard. Blog posts, research papers, social media exchanges, etc. freely circulate among people of all varieties.
It’s a dishonest/manipulative sort of strategy — which makes it ethically questionable, is liable to fuel other trust-degrading behavior in the community, and is liable to drive away people with higher discourse standards.
A lot of the core arguments and hazards have no ‘normal-sounding’ equivalent. To sound normal, you have to skip those considerations altogether, or swap them out for much weaker arguments.
In exchange for attracting more people who are allergic to anything that sounds ‘sci-fi’, you lose people who are happy to speak to the substance of ideas even when they sound weird; and you lose sharp people who can tell that your arguments are relatively weak and PR-spun, but would have joined the conversation if the arguments and reasoning on display had been crisper and more obviously candid.
Another strategy would be ‘keep the field normal now, then turn weird later’. But how do you make a growing research field pivot? What’s the trigger? Why should we expect this to work, as opposed to just permanently diluting the field with false beliefs, dishonest norms, and low-relevance work?
My perception is that a large amount of work to date has gone into trying to soften and spin ideas so that they sound less weird or “sci-fi”; whereas relatively little work has gone into candidly stating beliefs, acknowledging that this stuff is weird, and clearly stating why you think it’s true anyway.
I don’t expect the latter strategy to work in all cases, but I do think it would be an overall better strategy, both in terms of ‘recruiting more of the people likeliest to solve the alignment problem’, and in terms of having fewer toxic effects on norms and trust within the field. Just being able to believe what people say is a very valuable thing in a position like ours.
Fair point, and one worth making in the course of talking about sci-fi sounding things! I’m not asking anyone to represent their beliefs dishonestly, but rather introduce them gently. I’m personally not an expert, but I’m not convinced of the viability of nanotech, so if it’s not necessary (rather it’s sufficient) to the argument, it seems prudent to stick to more clearly plausible pathways to takeover as demonstrations of sufficiency, while still maintaining that weirder sounding stuff is something one ought to expect when dealing with something much smarter than you.
If you’re trying to persuade smart programmers who are somewhat wary of sci-fi stuff, and you think nanotech is likely to play a major role in AGI strategy, but you think it isn’t strictly necessary for the current argument you’re making, then my default advice would be:
Be friendly and patient; get curious about the other person’s perspective, and ask questions to try to understand where they’re coming from; and put effort into showing your work and providing indicators that you’re a reasonable sort of person.
Wear your weird beliefs on your sleeve; be open about them, and if you want to acknowledge that they sound weird, feel free to do so. At least mention nanotech, even if you choose not to focus on it because it’s not strictly necessary for the argument at hand, it comes with a larger inferential gap, etc.
I think that even this scenario is implausible. I have the impression we are overestimating how easy it is to wipe out all humans quickly.
I’m retreating from my previous argument a bit. The AGI doesn’t need to cause literal human extinction with a virus; if it can cause enough damage to collapse human industrial civilization (while being able to survive said collapse) then that would also achieve most of the AGI’s goal of being able to do what it wants without humans stopping it. Naturally occurring pathogens from Europe devastated Native American populations after Columbus; throw a bunch of bad enough novel viruses at us at once and you probably could knock humanity back to the metaphorical Stone Age.
I find that more plausible. Also horrifying and worth fighting against, but not what EY is saying.
Note that EY is saying “there exists a real plan that is at least as dangerous as this one”; if you think there is such a plan, then you can agree with the conclusion, even if you don’t agree with his example. [There is an epistemic risk here, if everyone mistakenly believes that a different doomsday plan is possible when someone else knows why that specific plan won’t work, and so if everyone pooled all their knowledge they could know that none of the plans will work. But I’m moderately confident we’re instead in a world with enough vulnerabilities that broadcasting them makes things worse instead of better.]
Yes, I can imagine that. How does a superintelligence get one?
1. Solve the protein folding problem
2. Acquire a human DNA sample
3. Use superintelligence to construct a functional model of human biochemistry
4. Design a virus that exploits human biochemistry
5. Use one of the currently available biochemistry-as-a-service providers to produce a sample that incubates the virus and then escapes their safety procedures (e.g. pay someone to mix two vials sent to them in the mail. The aerosols from the mixing infect them)
Solve protein folding problem
Fine, no problems here. Up to a certain level of accuracy, I guess.
Acquire human DNA sample
Ok. Easy
Use superintelligence to construct a functional model of human biochemistry
By this, I can deduce different things. One, that you assume that this is possible from points one and two. This is nonsense. There are millions of things that are not written in the DNA. Also, you don’t need to acquire a human DNA sample; you just download a FASTA file. But, to steelman your argument, let’s say that the superintelligence builds a model of human biochemistry based not on a human DNA sample but on the corpus of biochemistry research, which is something that I find plausible. Up to a certain level!!! I don’t think that such a model would be flawless or even good enough, but fine.
Design a virus that exploits human biochemistry
Here I start having problems believing the argument. Not everything can be computed using simulations, guys. The margin of error can be huge. Would you believe in a superintelligence capable of predicting the weather 10 years in advance? If not, what makes you think that creating a virus is an easier problem?
Use one of the currently available biochemistry-as-a-service providers to produce a sample that incubates the virus and then escapes their safety procedures (e.g. pay someone to mix two vials sent to them in the mail. The aerosols from the mixing infect them)
Even if you succeed at this, and there are hundreds of alarms that could go off in the meantime, how do you guarantee that the virus kills everyone?
I am totally unconvinced by this argument.
Because viruses already exist, and unlike the weather, the effect of a virus on a human body isn’t sensitive to initial conditions the way the weather, a three-body gravitational system, or a double pendulum is. Furthermore, humans have already genetically engineered existing viruses to do things that we want them to do...
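For readers who haven't met "sensitive to initial conditions" before, here is a minimal sketch of what the phrase means, using the logistic map as a stand-in toy system. This is my own illustration, not something from the comment above: the function name `logistic_orbit` and the parameter choices are assumptions, and the toy says nothing about whether any biological system behaves one way or the other. It only shows that the same rule can be either forgiving or unforgiving of a tiny error in the starting state.

```python
def logistic_orbit(x0, r, steps=60):
    """Iterate the logistic map x -> r * x * (1 - x) starting from x0."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1 - xs[-1]))
    return xs


def gaps(r, x0=0.2, eps=1e-10, steps=60):
    """Distance between two orbits whose starting points differ by eps."""
    a = logistic_orbit(x0, r, steps)
    b = logistic_orbit(x0 + eps, r, steps)
    return [abs(x - y) for x, y in zip(a, b)]


# r = 4.0 is a chaotic regime: the initial error roughly doubles each step.
chaotic = gaps(r=4.0)
# r = 2.5 has a stable fixed point at 0.6: the initial error shrinks away.
stable = gaps(r=2.5)

print("chaotic: start gap", chaotic[0], "largest gap", max(chaotic))
print("stable:  start gap", stable[0], "final gap", stable[-1])
```

In the chaotic regime the two orbits end up unrelated even though they started 1e-10 apart; in the stable regime they converge. Which of these regimes a given real-world system resembles is exactly what is being argued about in this thread.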
You don’t really have to. Killing 19 out of every 20 people in the world would probably work just as well for ensuring the survivors can’t do anything about whatever it is that you want to do.
Would you say that a superintelligence would be capable of predicting the omicron variant from the alpha strain? Are you saying that the evolution of the complex system resulting from the interaction between the virus and the human population is easier to compute than a three-body gravitational system? I am not denying that we can create a virus; I am denying that someone or something can create a virus that kills all humans and that the evolution of the system can be known in advance.
I see your point. Humans tried to cull the population of (accidentally introduced) rabbits in Australia by using a natural virus that was highly lethal to them; the virus mutated to be less lethal and the rabbit population rebounded.
Also, a virus like that would cause great harm, but wouldn’t wipe out humanity.
Yes, I can imagine many things. I can also imagine all molecules in a glass of water bouncing off in a way that suddenly the water freezes. I don’t see how a superintelligence makes that happen. This is the biggest mistake that EY is making. He is equating enormous ability with almightiness. They are different. I think that pulling off what you suggest is beyond what a superintelligence can do.
Security mindset suggests that it’s more useful to think of ways in which something might go wrong, rather than ways in which it might not.
So rather than poking holes into suggestions (by humans, who are not superintelligent) for how a superintelligence could achieve some big goal like wiping out humanity, I expect you’d benefit much more from doing the following thought experiment:
Imagine yourself to be 1000x smarter, 1000x quicker at thinking and learning, with Internet access but no physical body. (I expect you could also trivially add “access to tons of money” from discovering a security exploit in a cryptocurrency or something.) How could you take over the world / wipe out humanity, from that position? What’s the best plan you can come up with? How high is its likelihood of success? Etc.
I agree that it can be more useful, but this is not what is being discussed or what I am criticizing. I never said that AGI won’t be dangerous nor that it is not important to work on this. What I am a bit worried about is that this community is getting something wrong, namely, that an AGI will exterminate the human race and it will happen soon. Realism and objectivity should be preserved at all costs. Having a totally unrealistic take on the real hazards will cause backlash eventually: think of the many groups that have argued that to better fight climate change we need to consider the worst-case scenario, that we need to exaggerate and scare people. I feel the LW community is falling into this.
I understand your worry, but I was addressing your specific point that “I think that pulling off what you suggest is beyond what a superintelligence can do”.
There are people who have reasonable arguments against various claims of the AI x-risk community, but I’m extremely skeptical of this claim. To me it suggests a failure of imagination, hence my suggested thought experiment.
I see. I agree that it might be a failure of imagination, but if it is, why do you consider that way more likely than the alternative “it is not that easy to do something like that even being very clever”? The problem I have is that all doom scenarios that I see discussed are so utterly unrealistic (e.g. the AGI suddenly makes nanobots and delivers them to all humans at once and so on) that it makes me think that the fact we are failing at conceiving plans that could succeed is because it might be harder than we think.
There would also be a fraction of human beings who would probably be immune. How does the superintelligence solve that? Can it also know the full diversity of human immune systems?
Untreated rabies has a survival rate of literally zero. It’s not inconceivable that another virus could be equally lethal.
(Edit: not literally zero, because not every exposure leads to symptoms, but surviving symptomatic rabies is incredibly rare.)
I agree with your broader point that a superintelligence could design incredibly lethal, highly communicable diseases. However, I’d note that it’s only symptomatic untreated rabies that has a survival rate of zero. It’s entirely possible (even likely) to be bitten by a rabid animal and not contract rabies.
Many factors influence your odds of developing symptomatic rabies, including bite location, bite depth and pathogen load of the biting animal. The effects of pathogen inoculations are actually quite dependent on initial conditions. Presumably, the inoculum in non-transmitting bites is greater than zero, so it is actually possible for the immune system to fight off a rabies infection. It’s just that, conditional on having failed to do so at the start of infection, the odds of doing so afterwards are tiny.
You’re actually right about rabies; I found things saying that about 14% of dogs survive and a group of unvaccinated people who had rabies antibodies but never had symptoms.
How do you guarantee that all humans get exposed to a significant dosage before they start reacting? How do you guarantee that there aren’t entire populations (maybe in places with large genetic diversity like India or Africa) that happen to be immune?
Just want to preemptively flag that in the EA biosecurity community we follow a general norm against brainstorming novel ways to cause harm with biology. Basic reasoning is that succeeding in this task ≈ generating info hazards.
Abstractly postulating a hypothetical virus with high virulence + transmissibility and a long latent period can be useful for facilitating thinking, but brainstorming the specifics of how to actually accomplish this—as some folks in these and some nearby comments are trending in the direction of starting to do—poses risks that exceed the likely benefits.
Happy to discuss further if interested, feel free to DM me.
Thanks for the heads-up, it makes sense.