I’d normally be wary of criticizing this, because it’s written by Bostrom, one of the earliest and clearest thinkers on AI risk. But I think the argument is wrong.
The argument compares “business as usual” (people living normal lives and then dying), “AI gives everyone longevity”, and “AI kills everyone”. But these are not the only possibilities. A neglected fourth possibility is that AI makes things much worse than just killing everyone: for example, large numbers of people could end up in inescapable servitude. I think such outcomes are actually typical of many near-misses at alignment, including the particular near-miss that’s becoming more probable by the day: the powerful successfully align the AI to themselves, and it enables them to lord it over the powerless forever. To believe that the powerful will be nice to the powerless of their own accord, given our knowledge of history, is very rose-colored thinking.
For example, in one of the past threads someone suggested to me that since human nature contains nonzero altruism, some of the powerful people of the future will set up “nature preserves” where the powerless can live happy lives. When I pointed out that human nature also contains other nonzero urges besides altruism, and asked why most of the powerless will end up in these “nature preserves” rather than somewhere else run by a less-nice powerful person, I got no answer.
The economic rationale for human servitude disappears when the machines are better than humans at everything. That doesn’t prevent sadistic mistreatment or killing the poor to use their atoms for something else, but it’s a major disanalogy from history. What lessons you draw probably depend on whether you think the rich and powerful are sadistic (actively wanting to harm the poor) or merely mostly selfish (happy to help the poor if it’s trivially cheap or they get their name on a building in return, but not if it makes a dent in the yacht and caviar budget).
“Actively wanting to harm the poor” doesn’t strike at the heart of the issue. Nor is it about economics. The issue is that the powerful want to feel socially dominant. There have been plenty of historical examples where this turned ugly.
I’m maybe more attuned to this than most people. I still remember my first time (as a child) going to a restaurant that had waiters, and feeling very clearly that being waited-on was not only about getting food, but also partly an ugly dominance ritual that I wanted no part of. On the same continuum you have kings forcing subjects to address them as “Your Majesty”: it still kinda blows my mind that that was a real thing.
I see. I think you should write a post trying to imagine in detail the failure modes you foresee if AI is aligned to the rich and powerful. What happens to the masses in those worlds, specifically? Are they killed, tortured, forced to work as waiters, or what? I have “merely mostly selfish” psych intuitions, so when I imagine Sam Altman being God-Emperor, I imagine that being like “luxury post-scarcity utopia except everyone has been brainwashed to express gratitude to the God-Emperor Sam I for giving them eternal life in utopia”, which is not ideal, but still arguably vastly better than worlds (like the status quo) with death and suffering. If you’re envisioning something darker, I think being more concrete would help puncture the optimism of people like me.
Hmm hm. Being forced to play out a war? Getting people’s minds modified so they behave like house elves from HP? Selective breeding? Selling some of your poor people to another rich person who’d like to have them? It’s not even like I’m envisioning something specific that’s dark; I just know that a world where some human beings have absolute root-level power over many others is gonna be dark. Let’s please not build such a world.
The powerful want to be socially dominant, but to what extent are they willing to engineer painful conditions to get a greater high from the quality-of-life disparity? In a world where robots deliver extreme material abundance, that more or less is “actively wanting to harm the poor”. It’s true that some people sadistically enjoy signs of disparity, but how much of that pleasure comes from the economic realities themselves, versus disparity being the intrinsic motivation for seeking power in the first place?
I’m not sure of the right way to model how this will play out, but my guess is that the outcome isn’t knowable from where we stand. I think it will heavily depend on:
The particular predispositions of the powerful people pushing the technology
The shape of the tech tree and how we explore it
My read of Bostrom’s intent is that s-risks are deliberately excluded because they fall under the “arcane” category of considerations (per the Evaluative Framework section), and the analysis is supposed to look simply at Overton Window tradeoffs around lives saved.
However, I think you could make a fair argument that s-risks fall within the Overton Window if framed correctly, e.g. “consider the possibility that your ideological/political enemies win forever”. This is already part of the considerations being made by AI labs and relevant governments, in terms as simple as US vs. China.[1] That said, I think the narrower analysis Bostrom does here is still interesting.
One might argue this is not a “real” s-risk, but Anthropic’s Dario Amodei, for example, seems pretty willing to risk the destruction of humanity over China reaching ASI first, judging by his public statements, so I think it counts as a meaningful consideration in the public discourse beyond mere lives saved or lost.