Sense-making about extreme power concentration
Various people are worried about AI causing extreme power concentration of some form, for example via AI-enabled human power grabs or gradual disempowerment.
I have been talking to some of these people and trying to sense-make about ‘power concentration’.
These are some notes on that, mostly prompted by some comment exchanges with Nora Ammann (in the below I’m riffing on her ideas but not representing her views). Sharing because I found some of the below helpful for thinking with, and maybe others will too.
(I haven’t tried to give lots of context, so it probably makes most sense to people who’ve already thought about this. More the flavour of in-progress research notes than ‘here’s a crisp insight everyone should have’.)
AI risk as power concentration
Sometimes when people talk about power concentration it sounds to me like they are talking about most of AI risk, including AI takeover and human powergrabs.
To try to illustrate this, here’s a 2x2, where the x axis is whether AIs or humans end up with concentrated power, and the y axis is whether power gets concentrated emergently or through power-seeking:
Some things I like about this 2x2:
It suggests a spectrum between AI and human power. You could try to police the line the y axis draws between human and AI, and say ‘this counts as human takeover, this counts as AI takeover’. But I expect it’s usually more useful to hold it as a messy spectrum where things just shade into one another.
It also suggests a spectrum between power-seeking and emergent. I expect that in most scenarios there will be some of both, and again I expect policing the line won’t be that helpful.
I don’t think it’s obvious that we should have strong preferences about where we fall on either spectrum. Human power concentration doesn’t seem obviously better than AI power concentration, and emergent doesn’t seem obviously less bad than power-seeking. I’d like people who have strong opinions about where we should prefer to be on these spectrums to make that case.
For me, when I look at the 2x2 I think that power-seeking vs emergent is more where it’s at than sudden vs gradual. You could imagine a powergrab which unfolds gradually over a number of years through a process of backsliding, but is very deliberate. You could also imagine most humans being disempowered quite quickly by emergent dynamics like races to the bottom in epistemics, e.g. crazy superpersuasion capabilities get deployed and quickly make everyone kind of insane.
Overall, this isn’t my preferred way of thinking about AI risk or power concentration, for two reasons:
I think it’s useful to have more granular categories, and don’t love collapsing everything into one container
Misaligned AI takeover and gradual disempowerment could result in power concentration (where most power is in the hands of a small number of actors), but could also result in a power shift (where power is still in the hands of lots of actors, just none of them are human). I don’t have much inside view on how much probability mass should go on power shift vs power concentration here, and would be interested in people’s takes.
But since drawing this 2x2 I more get why for some people this makes sense as a frame.
Power concentration as the undermining of checks and balances
Nora’s take is that the most important general factor driving the risk of power concentration is the undermining of legacy checks and balances.
Here’s my attempt to flesh out what this means (Nora would probably say something different):
The world is unequal today and power is often abused, but there are limits to this
Specifically, there are a bunch of checks and balances on power, including:
Formal/bureaucratic checks: the law, elections
Checks on hard power: capital, military force and the ability to earn money are all somewhat distributed, so there are other powerful actors who can keep you in check
Soft checks: access to information, truth often being more convincing than lies
These provide real (if partial) constraints on power at human capability levels
AI will radically increase the power of some actors (in speed and in raw capabilities). Many of these checks won’t hold up against that, potentially leading to very large concentrations of unchecked power
Some things I like about this frame:
One thing is that it feels more mechanistic to me than abstractions like ‘the economy, culture, states’. When I read the gradual disempowerment paper I felt a bit disorientated, and now I more get why they chose these things to focus on
Somehow it stands out as very obvious to me in the above that eroding one check makes it much easier to erode the other ones. You can still think this in other frames, but for me it’s especially salient in this one. I’m a lot more worried about formal checks being undermined if hard power and ability to sense-make is already very concentrated, and similarly for other combinations.
It feels juicy for thinking about more specific scenarios. E.g.
Formal checks could get eroded suddenly through powergrabs, or gradually through backsliding. Possibly they could also get eroded emergently rather than deliberately, if the pace of change gets way faster than election cycles and/or companies get way more capable than governments
Concentrating hard power feels very closely tied to automating human labour. I think the more labour automation there is, the more capital and the ability to earn money and wage war will get concentrated. I do think you could get military concentration prior to rolling out the automation of human labour though, if one actor develops much more powerful military tech than everyone else very rapidly.
The soft checks could get eroded either by deliberate adversarial interference on the part of powerful actors, or emergently through races to the bottom if the incentives are bad or we’re unlucky about what technologies get developed when (e.g. crazy superpersuasion before epistemic defenses against it). Eroding the soft checks would in itself disempower a lot of people (who’d no longer be able to effectively act in their own interests), and could also make it much easier for people to concentrate power even further.
Yeah. I think the main reason democracy exists at all is that people are necessary for skilled labor and for fighting wars. If that goes away, the result will be a world where money and power just discard most people. Why some people think “oh we’ll implement UBI if it comes to that”, I have no idea. When it “comes to that”, there won’t be any force powerful enough to implement UBI and interested in doing so. My cynical view is that the promise of distributing AI benefits in the future is a distraction: look over there, while we take all power.
I think it’s relatively genuine; it’s just misguided because it comes from people who believe ideology is upstream of material incentives rather than the other way around. I think ideology has a way to go against material incentives… potentially… for a time. But it can’t take sustained pressure. Even if you did have someone implementing UBI under those conditions, after a few years or decades the question “but why are we supporting all these useless parasites whom we don’t need, rather than taking all the resources for ourselves?” would arise. The best contribution the rest of humanity seems able to offer at that point is “genetic diversity to avoid the few hundred AI Lords becoming more inbred than the Habsburgs”, and that’s not a lot.
There is one counterargument that I sometimes hear that I’m not sure how convincing I should find:
AI will bring unprecedented and unimaginable wealth.
More than zero people are more than zero altruistic.
It might not be a stretch to argue that at least some altruistic people might end up with some considerable portion of that large wealth and power.
Therefore: some of these somewhat-well-off somewhat-altruists[1] would rather give up little bits of their wealth[2] and power than see the largest humanitarian catastrophe ever unfold before their eyes, with their own inaction playing a central role, especially if they have to give up comparatively so little to save so many.
Do you agree or disagree with any parts of this?
P.S. This might go without saying, but this question might only be relevant if technical alignment can be and is solved in some fashion. With that said, I think it’s entirely good to ask, lest we clear one impossible-seeming hurdle and still find ourselves in a world of hurt all the same.
This only needs there to exist something of a Pareto frontier of either very altruistic okay-offs, or well-off only-a-little-altruists, or somewhere in between. If we have many very altruistic very-well-offs, then the argument might just make itself, so I’m arguing in a less convenient context.
This might truly be tiny indeed, like one one-millionth of someone’s wealth, truly a rounding error. Someone arguing for side A might be positing a very large amount of callousness if all other points stand. Or indifference. Or some other force that pushes against the desire to help.
I do buy this, but note this requires fairly drastic actions that essentially amount to a pivotal act using an AI to coup society and government, because they have a limited time window in which to act before economic incentives mean that most of the others kill/enslave almost everyone else.
Contra cousin_it, I basically don’t buy the story that power corrupts/changes your values; instead it corrupts your world model, because there’s a very large incentive for your underlings to misreport things in ways that flatter you. But conditional on technical alignment being solved, this doesn’t matter anymore, so I think power-grabs might not result in as bad an outcome as we feared.
But this does require pretty massive changes to ensure the altruists stay in power, and they are not prepared to think about what this will mean.
If there’s a small class of people with immense power over billions of have-nothings that can do nothing back, sure, some of the superpowerful will be more than zero altruistic. But others won’t be, and overall I expect callousness and abuse of power to much outweigh altruism. Most people are pretty corruptible by power, especially when it’s power over a distinct outgroup, and pretty indifferent to abuses of power happening to the outgroup; all history shows that. Bigger differences in power will make it worse if anything.
I think I somewhat see where you are coming from, but can you spell it out for me a bit more? Maybe by describing a somewhat fleshed-out concrete example scenario, while I acknowledge that this is just one hastily-put-together possibility of many.
Let me start by proposing one such possibility, but feel free to go in another direction entirely. Let’s suppose the altruistic few put together sanctuaries or “wild human life reserves”; how might this play out from there? Will the selfish ones somehow try to intrude on or curtail this practice? By our scenario’s granted premises, the altruistic ones do wield real power, and they do use some fraction of it to maintain this sanctum. Even if the others are many, would they have a lot to gain by trying to mess with this? Is it just entertainment, or sport, for them? What do they stand to gain? Not really anything economic, or more power, or maybe you think that they do?
Why do you think all poor people will end up in these “wildlife preserves”, and not somewhere else under the power of someone less altruistic? A future of large power differences is… a future of large power differences.
I think this might happen early on. But if it keeps going, and the gap keeps widening, and then maybe the AI controllers get some kind of body or mental enhancement, then the material incentives obviously point in the direction of “ditch those other nobodies”, and then ideology arises to justify why ditching those other nobodies is just and right.
Consider this: when the Europeans started colonising the New World, it turned out that it would be extremely convenient to have free manual labour to bootstrap agriculture quickly. Around this time, coincidentally, the same white Christian Europeans who had been relatively anti-slavery (at least against enslaving other Christians) since Roman times, and fairly uninterested in the goings on of Africa, found within themselves a deep urge to go get random Africans, put them in chains, and forcefully convert them while keeping them enslaved as a way to give meaning to their otherwise pagan and inferior existences. Well, first they started with the natives, and then when they ran out of those, they looked at Africa instead. Similar impulses to altruistically civilize all the poor barbarians across the world arose just as soon as there were shipping fleets, arms, and manpower sufficient to make resource-extracting colonies quite profitable enterprises.
That seems an extremely weird coincidence if it wasn’t just a case of people rationalizing why they should do obviously evil things that were, however, also obviously convenient.
Thanks for your response. Can I ask the same question of you as I do in this cousin comment?
I think the specific shape is less important than the obvious general trend, regardless of what form it takes—forms tend to be incidental to circumstances, but incentives and responses to them are much more reliable.
That said, I would say the most “peaceful” version of it looks like people getting some kind of UBI allowance and stuff to buy with it, but the stuff keeps becoming less and less (as it gets redirected to “worthier” enterprises) and more expensive. As conditions worsen and people are generally depressed and have no belief in the future, they simply have fewer children. Possibly forms of wireheading get promoted; this does not even need to be some kind of nefarious plan, as the altruists among the AI Lords may genuinely believe it’s an operation of relief for those among the masses who feel purposeless, and the best they can do at any given time. This of course results in massive drops in birth rates and general engagement with the real world. The less people engage with the world, the cheaper their maintenance is, and the more pressure builds up on those who refuse the wireheading: less demand means less supply, and then the squeeze hurts those who still want to hang on to the now obsolete lifestyle. Add the obvious possibility of violent, impotent lashing out from the masses, followed by entitled outrage at the ingratitude of it all, which justifies bloody repression. The ones left outside of the circle get chipped away at until they dwindle into almost nothing, and the ones inside keep passively reaping the benefits, as everything they do is always nominally just within their rights, either disposing of their own property or defending against violent aggression. Note how most of this is really just an extrapolation on turbo-steroids of trends we can all already see having emerged in industrialised societies.
My understanding is that this explains quite well what is happening in poor countries and why democracies came to be, but I don’t think it’s obvious this also applies in situations where there are enough resources to make everyone somewhat rich, because betting on a coup to concentrate more power in the hands of your faction becomes less appealing as people get closer to saturating their selfish preferences.
When Norway started to make a lot of money with oil, my understanding is that it didn’t become significantly more authoritarian, and that if its natural resource rent exploded to become 95% of its GDP, there would be a decent amount of redistribution (maybe not an egalitarian distribution, but sufficient redistribution that you would be better off being in the bottom 10% of citizens of this hypothetical Norway than in the top 10% of any major Western country today) and it would remain a democracy.
I think the core blocker here is that existential threats to factions that don’t gain power, or even perceived existential threats, will become more common: you’ve removed the constraint of popular revolt/the ability to sabotage logistics, and you’ve made coordination to remove other factions very easy. I also expect identity-based issues to be much more common post-AGI, as the constraint of wealth/power is almost completely removed, and these issues tend to be easily heightened to existential stakes.
I’m not giving examples here, even though they exist, since they’re far too political for LW, and LW’s norm against politics is especially important to preserve here, because these existential issues would over time make LW comment/post quality much worse.
More generally, I’m more pessimistic about what a lot of people would do to other people if they didn’t have a reason to fear any feedback/backlash.
Edit: I removed the section on the more common case being Nigeria or Congo.
These are not valid counterexamples because my argument was very specifically about rich countries that are already democracies. As far as I know 100% (1/1) of already rich democracies that then get rich from natural resources don’t become more autocratic. The base rate is great! (Though obviously it’s n=1 from a country that has an oil rent that is much less than 95% of its GDP.)
Reference class tennis, yay!
I guess I’m a bit confused about why emergent dynamics and power-seeking are on different ends of the spectrum?
Like, what do you even mean by emergent dynamics there? Are we talking about non-power-seeking systems, and in that case, what systems are non-power-seeking?
I would claim that there is no system that is not power-seeking, since any system that survives needs to do Bayesian inference and therefore needs to minimize free energy. (Self-referencing here, but whatever.) Hence any surviving system needs to power-seek, given that power-seeking is attaining more causal control over the future.
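(For reference, and using standard notation rather than anything from the work referenced above: the “free energy” here is presumably the variational free energy from active inference, where for observations \(x\), latent states \(z\), and an approximate posterior \(q(z)\),

\[ F(q) = \mathbb{E}_{q(z)}\big[\log q(z) - \log p(x, z)\big] = D_{\mathrm{KL}}\big(q(z)\,\|\,p(z \mid x)\big) - \log p(x). \]

Minimizing \(F\) over \(q\) amounts to approximate Bayesian inference, and the active-inference move is to also choose actions that minimize expected free energy, which is roughly where the “causal control over the future” reading of power-seeking comes in.)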
So, therefore, there is no future where there is no power-seeking system; it is just that the thing that power-seeks acts over larger timespans and is more of a slow actor. The agentic attractor space is just not human flesh-bag space nor traditional space; it is different, yet still a power-seeker.
Still, I do like what you say about the change in the dynamics and how power-seeking is maybe more about a shorter temporal scale? It feels like the y-axis should be that temporal axis instead since it seems to be more what you’re actually pointing at?
Thanks, agree that ‘emergent dynamics’ is woolly above.
I guess I don’t think the y-axis should be the temporal dimension. To give some cartoon examples:
I’d put an extremely Machiavellian 10-year plan on the part of a cabal of politicians to backslide into a dictatorship and then seize power over the rest of the world near the top end of the axis
I’d put an unfavourable order of capabilities, where in an unplanned way superpersuasion comes online before defenses, and actors fail to coordinate not to deploy it because of competitive dynamics, near the bottom end of the axis, even if the whole thing unfolds over a few months
I do think the y-axis is pretty correlated with temporal scales, but I don’t think it’s the same. I also don’t think physical violence is the same, though it’s probably also correlated (cf. the backsliding example, which is very power-seeking but not very violent).
The thing I had in mind was more like, should I imagine some actor consciously trying to bring power concentration about? To the extent that’s a good model, it’s power-seeking. Or should I imagine that no actor is consciously planning this, but the net result of the system is still extreme power concentration? If that’s a good model, it’s emergent dynamics.
Idk, I see that this is messy and probably there’s some other better concept here
Thank you for clarifying, I think I understand now!
I notice I was not that clear when writing my comment yesterday so I want to apologise for that.
I’ll give an attempt at restating what you said in other terms. There’s a concept of temporal depth in action plans. The question is, to some extent, how many steps into the future you are looking. A simple way of imagining this is how far into the future a chess bot can plan; Stockfish is able to plan basically 20-40 moves in advance.
It seems similar to what you’re talking about here, in that the longer someone plans into the future, the better they can route around external attempts to interfere with their actions.
Some other words to describe the general vibe might be planned vs unplanned or maybe centralized versus decentralized? Maybe controlled versus uncontrolled? I get the vibe better now though so thanks!
I think it’s the combination of a temporal axis and a (for lack of a better term) physical violence axis.