There is one counterargument that I sometimes hear that I’m not sure how convincing I should find:
1. AI will bring unprecedented and unimaginable wealth.
2. More than zero people are more than zero altruistic.
3. It might not be a stretch to argue that at least some altruistic people might end up with some considerable portion of that large wealth and power.
4. Therefore: some of these somewhat-well-off somewhat-altruists[1] would rather give up small bits of their wealth[2] and power than watch the largest humanitarian catastrophe ever unfold before their eyes, with their own inaction playing a central role, especially when they have to give up comparatively so little to save so many.
Do you agree or disagree with any parts of this?
P.S. This might go without saying, but the question is probably only relevant if technical alignment can be and is solved in some fashion. With that said, I think it’s entirely good to ask, lest we clear one impossible-seeming hurdle and still find ourselves in a world of hurt all the same.
[1] This only requires that there exist something of a Pareto frontier of very altruistic okay-offs, well-off only-a-little-altruists, or somewhere in between. If we have many very altruistic very-well-offs, then the argument might just make itself, so I’m arguing in the less convenient context.
[2] This might truly be tiny indeed, like one one-millionth of someone’s wealth, truly a rounding error. Someone arguing for side A would then be positing a very large amount of callousness if all the other points stand. Or indifference. Or some other force that pushes against the desire to help.
It’s also quite possible that some will be sadistic. Once powerful AI is in the picture, it also unlocks cheap, convenient, easy-to-apply, extremely potent brain-computer interfaces that can mentally enslave people. And that sort of thing snowballs, because the more loyal-unto-death servants you have, the easier it is to kidnap and convert new people.
Add other tech potentially unlocking things like immortality, and you have a recipe for things going quite badly if some sadist gets enough power to ramp up into even more…
I mean, plus the weird race dynamics of the AI itself. Will the few controllers of AI cooperate peacefully, or occasionally get into arguments and grow jealous of each other’s power? Might they one day get into a fight or serious competition that pushes them to build even stronger AI servants to best their opponent, and thus leads to them losing control? Or any of a wide variety of other ways humans may fail to remain consistently sensible over a long period. It seems pretty likely to me that even one in a thousand of the AI Lords losing control could easily lead to their uncontrolled AI self-improving enough to escape and conquer all the AI Lords. It just doesn’t seem like a sane and stable configuration for humanity to aim for, insofar as we are able to aim for anything.
The attractor basin around ‘genuinely nice value-aligned AI’ seems a lot more promising to me than ‘obedient AI controlled by centralized human power’. MIRI & co make arguments about a ‘near miss’ on value alignment being catastrophic, but after years of thought and debate on the subject, I’ve come around to disagreeing with this point. A really smart, really powerful AI that is trying its best to help humanity and satisfy humanity’s extrapolated values as best it can seems likely to… approach the problem intelligently. Like, recognize the need for epistemic humility and for continued human progress…
If there’s a small class of people with immense power over billions of have-nothings that can do nothing back, sure, some of the superpowerful will be more than zero altruistic. But others won’t be, and overall I expect callousness and abuse of power to much outweigh altruism. Most people are pretty corruptible by power, especially when it’s power over a distinct outgroup, and pretty indifferent to abuses of power happening to the outgroup; all history shows that. Bigger differences in power will make it worse if anything.
I think I somewhat see where you are coming from, but can you spell it out for me a bit more? Maybe by describing a somewhat fleshed-out, concrete example scenario, while acknowledging that any such scenario is just one hastily-put-together possibility among many.
Let me start by proposing one such possibility, but feel free to go in another direction entirely. Suppose the altruistic few put together sanctuaries or “wild human life reserves”; how might things play out from there? Will the selfish ones somehow try to intrude on or curtail this practice? By our scenario’s granted premises, the altruistic ones do wield real power, and they use some fraction of it to maintain this sanctum. Even if the others are many, would they have much to gain by trying to mess with it? Is it just entertainment or sport for them? What do they stand to gain? Not really anything economic, nor more power, or do you think that they do?
Why do you think all poor people will end up in these “wildlife preserves”, and not somewhere else under the power of someone less altruistic? A future of large power differences is… a future of large power differences.
I do buy this, but note that it requires fairly drastic actions, essentially amounting to a pivotal act that uses an AI to coup society and government, because the altruists have a limited window in which to act before economic incentives mean that most of the others kill or enslave almost everyone else.
Contra cousin_it, I basically don’t buy the story that power corrupts or changes your values; rather, it corrupts your world model, because your underlings have a very large incentive to misreport things in ways that flatter you. But conditional on technical alignment being solved, that no longer matters, so I think power grabs might not result in as bad an outcome as we feared.
But this does require pretty massive changes to ensure the altruists stay in power, and they are not prepared to think about what this will mean.
All of 1-4 seem plausible to me, and I don’t centrally expect that power concentration will lead to everyone dying.
Even if all of 1-4 hold, I think the future will probably be a lot less good than it could have been:
- 4 is more likely to mean that earth becomes a nature reserve for humans or something, than that the stars are equitably allocated
- I’m worried that there are bad selection effects such that 3 already screens out some kinds of altruists (e.g. ones who aren’t willing to strategy steal). Some good stuff might still happen to existing humans, but the future will miss out on some values completely
- I’m worried about power corrupting/there being no checks and balances/there being no incentives to keep doing good stuff for others
I think this might happen early on. But if it keeps going, and the gap keeps widening, and then maybe the AI controllers get some kind of bodily or mental enhancement, then the material incentives obviously point in the direction of “ditch those other nobodies”, and ideology arises to justify why ditching those other nobodies is just and right.
Consider this: when the Europeans started colonising the New World, it turned out to be extremely convenient to have free manual labour to bootstrap agriculture quickly. Around this time, coincidentally, the same white Christian Europeans who had been relatively anti-slavery (at least with regard to enslaving other Christians) since Roman times, and fairly uninterested in the goings-on of Africa, found within themselves a deep urge to go get random Africans, put them in chains, and forcibly convert them while keeping them enslaved, as a way to give meaning to their otherwise pagan and inferior existences. Well, first they started with the natives, and then, when they ran out of those, they looked to Africa instead. Similar impulses to altruistically civilize all the poor barbarians across the world arose just as soon as there were shipping fleets, arms, and manpower sufficient to make resource-extracting colonies quite profitable enterprises.
That seems like an extremely weird coincidence, unless it was just a case of people rationalizing why they should do obviously evil things that also happened to be obviously convenient.
Thanks for your response. Can I ask the same question of you as I do here in this cousin comment?
I think the specific shape matters less than the obvious general trend, whatever form it takes: forms tend to be incidental to circumstances, but incentives, and responses to them, are much more reliable.
That said, I would say the most “peaceful” version of it looks like people getting some kind of UBI allowance and stuff to buy with it, but the stuff keeps becoming scarcer (as resources get redirected to “worthier” enterprises) and more expensive. As conditions worsen and people become generally depressed, with no belief in the future, they simply have fewer children. Possibly forms of wireheading get promoted; this does not even need to be some kind of nefarious plan, as the altruists among the AI Lords may genuinely believe it’s a relief operation for those among the masses who feel purposeless, and the best they can do at any given time. This of course results in massive drops in birth rates and in general engagement with the real world.
The less people engage with the world, the cheaper their maintenance is, and the more pressure builds up on those who refuse the wireheading: less demand means less supply, and then the squeeze hurts those who still want to hang on to the now-obsolete lifestyle. Add the obvious possibility of violent, impotent lashing out from the masses, followed by entitled outrage at the ingratitude of it all, which justifies bloody repression. The ones left outside the circle get chipped away at until they dwindle to almost nothing, and the ones inside keep passively reaping the benefits, since everything they do is always nominally within their rights, whether disposing of their own property or defending against violent aggression. Note how most of this is really just an extrapolation, on turbo-steroids, of trends we can already see emerging in industrialised societies.