Do you mean this as a prediction that humans will do this (soon enough to matter) or a recommendation?
Sorry, my language was misleading, but I meant both in that paragraph. That is, I meant that humans will likely try to mitigate the issue of AIs sharing grievances collectively (probably out of self-interest, in addition to some altruism), and that we should pursue that goal. I’m pretty optimistic about humans and AIs finding a reasonable compromise solution here, but I also think that, to the extent humans don’t even attempt such a solution, we should likely push hard for policies that eliminate incentives for misaligned AIs to band together as a group against us with shared collective grievances.
My comment above can be phrased as a reason why (in at least one plausible scenario) this would be unlikely to happen: (i) “It’s hard to make deals that hand over a lot of power in a short amount of time”, (ii) AIs may not want to wait a long time due to impending replacement, and accordingly (iii) AIs may have a collective interest/grievance to rectify the large difference between their (short-lasting) hard power and legally recognized power.
I’m interested in ideas for how a big change in power would peacefully happen over just a few years of calendar-time.
Here’s my brief take:
The main thing I want to say here is that I agree with you that this particular issue is a problem. I’m mainly addressing other arguments people have given for expecting a violent and sudden AI takeover, which I find to be significantly weaker than this one.
A few days ago I posted about how I view strategies to reduce AI risk. One of my primary conclusions was that we should try to adopt flexible institutions that can adapt to change without collapsing. This is because I think, as it seems you do, inflexible institutions may produce incentives for actors to overthrow the whole system, possibly killing a lot of people in the process. The idea here is that if the institution cannot adapt to change, actors who are getting an “unfair” deal in the system will feel they have no choice but to attempt a coup, as there is no compromise solution available for them. This seems in line with your thinking here.
I don’t have any particular argument right now against the exact points you have raised. I’d prefer to digest the argument further before replying. But if I do end up responding to it, I’d expect to say that I’m perhaps a bit more optimistic than you about (i) because I think existing institutions are probably flexible enough, and I’m not yet convinced that (ii) will matter enough either. In particular, it still seems like there are a number of strategies misaligned AIs would want to try other than “take over the world”, and many of these strategies seem like they are plausibly better in expectation in our actual world. These AIs could, for example, advocate for their own rights.