To clarify my views on “will misaligned AIs that succeed in seizing all power have a reasonable chance of keeping (most/many) humans alive”:
I think this isn’t very decision relevant and is not that important. I think AI takeover kills the majority of humans in expectation, due both to the takeover itself and to killing humans afterward (as a side effect of industrial expansion, eating the biosphere, etc.), and there is a substantial chance of literal every-single-human-is-dead extinction conditional on AI takeover (30%?). Regardless, it destroys most of the potential value of the long-run future, and I care mostly about this.
So at least for me it isn’t true that “this is really the key hope held by the world’s reassuring voices”. When I discuss how I think about AI risk, this mostly doesn’t come up, and when it does I might say something like “AI takeover would probably kill most people and seems extremely bad overall”. Have you ever seen someone prominent pushing a case for “optimism” on the basis of causal trade with aliens / acausal trade?
The reason I brought up this topic is that I think it’s bad to make incorrect or weak arguments:
I think smart people will (correctly) notice these arguments seem motivated or weak and then, on the basis of this epistemic spot check, dismiss the rest. In argumentation, avoiding overclaiming has a lot of rhetorical benefits. I was using “but will the AI actually kill everyone” as an example of this. I think the other main case is “before superintelligence, will we be able to get a bunch of help with alignment work?”, but there are other examples.
Worse, bad arguments/content result in negative polarization of somewhat-higher-context people who might otherwise have been somewhat sympathetic, or at least indifferent. This is especially costly from the perspective of getting AI company employees to care. I get that you don’t care (much) about AI company employees because you think that radical change is required for there to be any hope, but I think marginal increases in caring among AI company employees substantially reduce risk (though aren’t close to sufficient for the situation being at all reasonable/safe).
Confidently asserted bad arguments, and claims people strongly disagree with, make it harder for people to join a coalition. Like, from an integrity perspective, I would need to caveat saying I agree with the book, even though I do agree with large chunks of it, and the extent to which I feel the need to caveat this could be reduced. IDK how much you should care about this, but insofar as you care about people like me joining some push you’re trying to make happen, this sort of thing makes some difference.
I do think this line of argumentation makes the title literally wrong, even if I thought the probability of AI takeover was much higher. I’m not sure how much to care about this, but I do think it randomly imposes a bunch of costs to brand things as “everyone dies” when a substantial fraction of the coalition you might want to work with disagrees and it isn’t a crux. Like, does the message’s punchiness outweigh the costs here from your perspective? IDK.
Responding to some specific points:
a lot of my objection to superalignment type stuff is a combination of:
I agree that automating alignment with AIs is pretty likely to go very poorly due to incompetence. I think this could go either way, and further effort on trying to make this go better is a pretty cost-effective way (in terms of using our labor, etc.) to marginally reduce doom, though it isn’t going to result in a reasonable/safe situation.
that the real world doesn’t look like any past predictions people made when they argued it’ll all be okay because the future will handle things with dignity
To be clear, I don’t think things will be OK exactly, nor do I expect that much dignity, though I think I do expect more dignity than you do. My perspective is more like “there seem to be some pretty effective ways to reduce doom at the margin” than “we’ll be fine because XYZ”.
my response to the trade arguments as I understand them is here plus in the footnotes here
I don’t think this seriously engages with the argument, though due to this footnote, I retract “they don’t talk at all about trade arguments for keeping humans alive” (I edited my comment).
As far as this section goes, I agree that it’s totally fine to say “everybody dies” if it’s overwhelmingly more likely that everyone dies. I don’t see how this responds to the argument that “it’s not overwhelmingly likely everyone dies because of acausal (and causal) trade”. I don’t know how important this is, but I also don’t know why you/Eliezer/MIRI feel it’s so important to argue against this, as opposed to saying something like: “AI takeover seems extremely bad and like it would at least kill billions of us. People disagree on exactly how likely it is that vast numbers of humans die as a result of AI takeover, but we think it’s at least substantial due to XYZ”. Is it just because you want to use the “everybody dies” part of the title? Fair enough, I guess...
If humans met aliens that wanted to be left alone, it seems to me that we sure would peer in and see if they were doing any slavery, or any chewing agonizing tunnels through living animals, or etc.
Sure, but would the outcome for the aliens be as bad as or worse than killing all of them, from their perspective? I’m skeptical.
Ty! For the record, my reason for thinking it’s fine to say “if anyone builds it, everyone dies” despite some chance of survival is mostly spelled out here. Relative to the beliefs you spell out above, I think the difference is a combination of (a) it sounds like I find the survival scenarios less likely than you do; (b) it sounds like I’m willing to classify more things as “death” than you are.
For examples of (b): I’m pretty happy to describe as “death” cases where the AI makes things that are to humans what dogs are to wolves, or (more likely) makes some other strange optimized thing that has some distorted relationship to humanity, or cases where digitized backups of humanity are sold to aliens, etc. I feel pretty good about describing many exotic scenarios as “we’d die” to a broad audience, especially in a setting with extreme length constraints (like a book title). If I were to caveat with “except maybe backups of us will be sold to aliens”, I expect most people to be confused and frustrated about me bringing that point up. It looks to me like most of the least-exotic scenarios are ones that route through things that lay audience members would pretty squarely call “death”.
It looks to me like the even more exotic scenarios (where modern individuals get “afterlives”) are in the rough ballpark of quantum immortality / anthropic immortality arguments. AI definitely complicates things and makes some of that stuff more plausible (b/c there’s an entity around that can make trades and has a record of your mind), but it still looks like a very small factor to me (washed out e.g. by alien sales) and feels kinda weird and bad to bring it up in a lay conversation, similar to how it’d be weird and bad to bring up quantum immortality if we were trying to stop a car speeding towards a cliff.
FWIW, insofar as people feel like they can’t literally support the title because they think that backups of humans will be sold to aliens, I encourage them to say as much in plain language (whenever they’re critiquing the title). Like: insofar as folks think the title is causing lay audiences to miss important nuance, I think it’s an important second-degree nuance that the allegedly-missing nuance is “maybe we’ll be sold to aliens”, rather than something less exotic than that.
(b) it sounds like I’m willing to classify more things as “death” than you are.
I don’t think this matters much. I’m happy to consider non-consensual uploading to be death, and I’m certainly happy to consider “the humans are modified in some way they would find horrifying (at least on reflection)” to be death. I think “the humans are alive in the normal sense of alive” is totally plausible, and I expect some humans to be alive in the normal sense of alive in the majority of worlds where AIs take over.
Making uploads is barely cheaper than literally keeping physical humans alive after AIs have fully solidified their power, I think (maybe 0-3 OOMs more expensive or something), so I don’t think non-consensual uploads are that much of the action. (I do think rounding humans up into shelters is relevant.)
(To answer your direct Q, re: “Have you ever seen someone prominent pushing a case for ‘optimism’ on the basis of causal trade with aliens / acausal trade?”, I have heard “well, I don’t think it will actually kill everyone because of acausal trade arguments” enough times that I assumed the people discussing those cases thought the argument was substantial. I’d be a bit surprised if none of the ECLW folks thought it was a substantial reason for optimism. My impression from the discussions was that you & others of similar prominence were in that camp. I’m heartened to hear that you think it’s insubstantial. I’m a little confused why there’s been so much discussion around it if everyone agrees it’s insubstantial, but I have updated towards it just being a case of people who don’t notice/buy that it’s washed out by sale to Hubble-volume aliens and who are into pedantry. Sorry for falsely implying that you & others of similar prominence thought the argument was substantial; I update.)
(I mean, I think it’s a substantial reason to think that “literally everyone dies” is considerably less likely, and it makes me not want to say stuff like “everyone dies”, but I just don’t think it implies much optimism, exactly because the chance of death still seems pretty high and the value of the future is still lost. Like, I don’t consider “misaligned AIs have full control and 80% of humans survive after a violent takeover” to be a good outcome.)