I agree, but it is important to note that the authors of the paper disagree here.
(It’s somewhat hard for me to tell what the crux is: that they don’t expect everyone would get AI aligned to them (at least as representatives), even if this were technically feasible with zero alignment tax, or that even if everyone had single-single aligned, corrigible AIs representing their interests and with control over their assets and power, that would still result in disempowerment. I think it is more like the second thing here.)
So Zvi is accurately representing the perspective of the authors; I just disagree with them.
Yes, Ryan is correct. Our claim is that even fully aligned personal AI representatives won’t necessarily be able to solve important collective action problems in our favor. However, I’m not certain about this. The empirical crux for me is: do collective action problems get easier to solve as everyone gets smarter together, or harder?
As a concrete example, consider a bunch of local polities in a literal arms race. If each had its own AGI diplomats, would they be able to stop the arms race? Or would the more sophisticated diplomats end up participating in precommitment races or other exotic strategies that might still prevent a negotiated settlement? Or perhaps the less sophisticated diplomats would fear that a complicated power-sharing agreement would eventually lead to their disempowerment anyway, and would refuse to compromise.
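To make the baseline dilemma explicit, here is a minimal game-theoretic sketch. It is not from the paper: the payoff numbers and the `binding_commitments` flag are invented for illustration. It only captures the simplest version of the problem, where arming dominates without enforceable commitments while commitment technology recovers the cooperative outcome; the worry above is precisely that sophisticated agents might re-break this with precommitment races and similar strategies, which this toy model does not capture.

```python
# Toy model: an arms race as a two-player prisoner's dilemma.
# Payoffs are made up; "binding_commitments" stands in for whatever
# verification technology AGI diplomats might have.

PAYOFFS = {  # (my move, their move) -> my payoff
    ("disarm", "disarm"): 3,   # negotiated settlement
    ("disarm", "arm"):    0,   # exploited
    ("arm",    "disarm"): 4,   # temporary advantage
    ("arm",    "arm"):    1,   # costly arms race
}

def best_response(their_move: str) -> str:
    """Pick the move that maximizes my payoff, given their fixed move."""
    return max(("disarm", "arm"), key=lambda m: PAYOFFS[(m, their_move)])

def outcome(binding_commitments: bool) -> tuple[str, str]:
    """Equilibrium of the one-shot game under the given commitment tech."""
    if binding_commitments:
        # Diplomats who can verify each other's commitments can jointly
        # pick the move pair that maximizes total payoff.
        return max(PAYOFFS, key=lambda pair: PAYOFFS[pair] + PAYOFFS[pair[::-1]])
    # Without commitments, each side plays its dominant strategy:
    # arming is the best response to either move the other might make.
    move = best_response("disarm")
    assert move == best_response("arm")  # confirms arming dominates
    return (move, move)

print(outcome(binding_commitments=False))  # ('arm', 'arm'): costly race
print(outcome(binding_commitments=True))   # ('disarm', 'disarm'): settlement
```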
As a less concrete example, our future situation might be analogous to a population of monkeys with uneven access to human representatives who earnestly advocate on their behalf. There is a giant, valuable forest the monkeys live in, next to a city where all important economic activity and decision-making happens between humans. Some of the human population (or some organizations, or governments) end up not being monkey-aligned, focusing instead on their own growth and security. The humans advocating on behalf of monkeys can see this happening, but because they cannot participate in wealth generation as effectively as independent humans, they eventually become a small and relatively powerless constituency. The government and various private companies regularly bid or tax enormous amounts of money for forest land; even the monkeys with index funds are eventually forced to sell, and then go broke paying rent.
I admit that there are many moving parts to this scenario, but it’s the closest simple analogy I’ve found so far to what I’m worried about. I’m happy for people to point out ways this analogy won’t match reality.
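One moving part that is easy to check numerically is the wealth dynamic at the end of the analogy. A minimal sketch, with invented growth rates: even if the monkey-aligned constituency gets steadily richer in absolute terms, a persistent growth-rate gap drives its share of total wealth, and hence its bidding power for the forest, toward zero.

```python
# Toy dynamic behind the analogy: two groups compound wealth at different
# rates. The growth rates and horizon are invented for illustration only.

monkey_wealth, human_wealth = 1.0, 1.0
monkey_growth, human_growth = 1.02, 1.10  # monkeys' representatives earn less

for year in range(100):
    monkey_wealth *= monkey_growth
    human_wealth *= human_growth

share = monkey_wealth / (monkey_wealth + human_wealth)
print(f"monkey wealth grew {monkey_wealth:.0f}x in absolute terms")
print(f"...but is only {share:.2%} of total wealth")
```

Under these made-up numbers the monkeys end up about 7x richer while holding roughly 0.05% of total wealth, which is the sense in which they are outbid despite never getting poorer.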
A zero alignment tax seems less than 50% likely to me.