(I’ve only read the parts I’m responding to)

> My high-level view is that the convincing versions of gradual disempowerment either rely on misalignment or result [from] power concentration among humans.
It feels like this statement should be qualified more; later it is stated that GD isn’t “similarly plausible to the risks from power-seeking AI or AI-enabled coups”, but this is holding GD to a higher bar; the relevant bar would seem to be “is plausible enough to be worth considering”.
“Rely[ing] on misalignment” is also an extremely weak condition: I claim that current systems are not aligned, and gradual disempowerment dynamics are already at play (cf AI “arms race”).
The analysis of economic disempowerment seems to take place in a vacuum, ignoring one of the main arguments we make, which is that different forms of disempowerment can mutually reinforce each other. The most concerning version of this, I think, is not just “we don’t get UBI”, but rather that the memes that say “it’s good to hand over as much power as quickly as possible to AI” win the day.
The analysis of cultural disempowerment goes one step “worse”, arguing that “If humans remain economically empowered (in the sense of having much more money than AI), I think they will likely remain culturally empowered.” I think we agree that a reasonable model here is one where cultural and economic are tightly coupled, but I don’t see why that means they won’t both go off the rails. You seem to think that they are almost guaranteed to feedback on each other in a way that maintains human power, but I think it can easily go the opposite way.
Regarding political disempowerment, you state: “It’s hard to see how those leading the state and the top AI companies could be disempowered, absent misalignment.” Personally, I find this quite easy. Insufficient elite coordination is one mechanism (discussed below). But reality can also just be unfriendly to you and force you to make choices about how you prioritize long-term vs. short-term objectives, leading people to accept deals like: “I’ll be rich and powerful for the next hundred years, and then my AI will take over my domain and do as it pleases”. Furthermore, if more people take such deals, this creates pressure for others to do so as well, since you need to get power in the short-term in order to remain “solvent” in the long term, even if you aren’t myopic yourself. I think this is already happening; the AI arms race is burning the commons every day; I don’t expect it to stop.
Regarding elite coordination, I also looked at the list under the heading “Sceptic: Why don’t the elites realise what’s happening and coordinate to stop it?” Another important reason not mentioned is that cooperating usually produces a bargaining game where there is no clearly correct way to split the proceeds of the cooperation.
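To make the bargaining point concrete, here is a minimal sketch (a textbook Nash demand game, my own illustration rather than anything from the paper): even when both sides gain from cooperating, any exact division of the surplus is an equilibrium, so there is no focal “correct” split, and holding out for incompatible shares destroys the gains.

```latex
% Minimal Nash demand game sketch (illustrative only; my own example, not from the paper).
\documentclass{article}
\usepackage{amsmath}
\begin{document}
Two parties could jointly capture a surplus normalised to $1$. Each simultaneously
demands a share $x_i \in [0,1]$, and payoffs are
\[
u_i(x_1, x_2) =
\begin{cases}
x_i & \text{if } x_1 + x_2 \le 1,\\[2pt]
0   & \text{if } x_1 + x_2 > 1.
\end{cases}
\]
Every profile with $x_1 + x_2 = 1$ is a Nash equilibrium, so there is a continuum
of mutually acceptable splits and no uniquely ``correct'' one; and if the parties
hold out for incompatible shares ($x_1 + x_2 > 1$), the gains from cooperation are
lost entirely.
\end{document}
```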
> It feels like this statement should be qualified more… this is holding GD to a higher bar
Yeah fair. I’ve edited to qualify it more.
> The analysis of economic disempowerment seems to take place in a vacuum, ignoring one of the main arguments we make, which is that different forms of disempowerment can mutually reinforce each other. The most concerning version of this, I think, is not just “we don’t get UBI”, but rather that the memes that say “it’s good to hand over as much power as quickly as possible to AI” win the day.
Yeah it’s a bit tricky to know how to structure the argument when you’re saying that the 3 domains all mutually reinforce each other. Like, after reading the paper it was unclear to me why the 3 domains don’t mutually reinforce each other to remain good given that they start good. The order in which the paper’s sections appear, and the arguments within, suggested that the main mechanism was:
1. Humans lose significant economic power.
2. Then they start losing cultural power and influence over the state.
3. Then they have so little of each that they are ultimately disempowered in all domains.
And so a natural response (which I gave) is:

1. Humans will keep economic power.
2. As a result, they’ll keep cultural influence and control of the state.
But yeah, I agree cultural shifts to favour handoff will happen even absent economic disempowerment, driven I think just by ordinary economic competition. And if we hand off to sufficiently misaligned AI, we’re screwed. Assuming AI is aligned enough that it never seeks power, I’m not sure how worried we should be about handoff. But plausibly we should demand a higher alignment bar than that.
Anyway, to my mind this argument is better understood as “competitive pressure to hand off to misaligned AI” than as an interplay between economic and cultural and state disempowerment, but I do buy it.
> The analysis of cultural disempowerment goes one step “worse”, arguing that “If humans remain economically empowered (in the sense of having much more money than AI), I think they will likely remain culturally empowered.” I think we agree that a reasonable model here is one where cultural and economic are tightly coupled, but I don’t see why that means they won’t both go off the rails. You seem to think that they are almost guaranteed to feedback on each other in a way that maintains human power, but I think it can easily go the opposite way.
Yeah, to clarify, I don’t feel it’s guaranteed to maintain human power. Overall, I feel like: “yeah, I guess maybe that could happen, though none of the mechanisms you mention seem that convincing, and there seem to be counter-considerations, and humans will have a strong incentive to keep power if they can, and (hopefully!) truthful AI advice to help them, and myopically aligned AIs to implement things to help… and also we can see this playing out in real time and respond, so I’m not sure it’s worth focussing on in advance (though I agree it will be worth focussing on while it’s happening)”.
Do you think that, absent AI power-seeking, this dynamic is highly likely to lead to human disempowerment? (If so, then I disagree.)
> Regarding political disempowerment, you state: “It’s hard to see how those leading the state and the top AI companies could be disempowered, absent misalignment.” Personally, I find this quite easy. Insufficient elite coordination is one mechanism (discussed below). But reality can also just be unfriendly to you and force you to make choices about how you prioritize long-term vs. short-term objectives, leading people to accept deals like: “I’ll be rich and powerful for the next hundred years, and then my AI will take over my domain and do as it pleases”. Furthermore, if more people take such deals, this creates pressure for others to do so as well, since you need to get power in the short-term in order to remain “solvent” in the long term, even if you aren’t myopic yourself. I think this is already happening; the AI arms race is burning the commons every day; I don’t expect it to stop.
I said “absent misalignment”, and I think your story involves misalignment? Otherwise the human could hand off to an AI that represents their interests. Clearly there’s a problem with handoff if the AI seeks power. And I agree it seems bad if the AI doesn’t seek power but also won’t represent human interests as it governs. Though I feel a bit confused about how humans are never able to coordinate to rein it all back in if AIs aren’t seeking power.
> Do you think that, absent AI power-seeking, this dynamic is highly likely to lead to human disempowerment? (If so, then I disagree.)
As a sort-of answer, I would just say that I am concerned that people might knowingly and deliberately build power-seeking AIs and hand over power to them, even if we have the means to build AIs that are not power-seeking.
> I said “absent misalignment”, and I think your story involves misalignment?
It does not. The point of my story is: “reality can also just be unfriendly to you”. There are trade-offs, and so people optimize for selfish, short-term objectives. You could argue people already do that, but cranking up the optimization power without fixing that seems likely to be bad.
My true objection is more that I think we will see extreme safety/performance trade-offs due to technical inadequacies, i.e. (roughly) that the alignment tax is large (although I don’t like that framing). In that case, you have misalignment despite also having a solution to alignment: competitive pressures prevent people from adopting the solution.