Some argue that even without misaligned AI, humanity could lose control of societal systems simply by delegating more and more to AI. They would delegate because these future AIs are more capable and faster than humans, and because competitive dynamics push everyone to delegate further, until eventually humans have no control over these societal systems.
Delegation ≠ loss of control, though. A principal can delegate to an agent while maintaining control and seeing what the agent does. CEOs and managers do this all the time obviously. So to go from “strong incentive to delegate” to “loss of control”, you may need to also argue that humans will be unable to meaningfully oversee what the AIs do, e.g., because those AIs are too fast and their actions are too complicated for humans to understand. (Again, we’re assuming these AIs are intent-aligned, so modulo information, humans can retain control over the AIs.)
I guess to me it isn’t at all obvious that all humans would in fact delegate everything to AIs when that means giving up meaningful control. First, there may well exist methods to better aggregate and abstract information for humans, so that they can understand enough of what the AIs are doing. Second, most humans would probably be reluctant to give up meaningful control when delegating—e.g., a CEO would likely be more reluctant to delegate a task or role if they have reason to think they will have no insight into how it’s done, or no ability to meaningfully control the employee—and this seems like it should move the equilibrium a bit away from “delegate everything”, even with competitive pressure. But unless all humans delegate in this way, some humans will retain meaningful control over the AIs, and arguments about gradual disempowerment look more like arguments about concentration of power.
If CEOs (and boards) are also AIs, the analogy breaks. Humans are currently necessary in such positions, and their necessity is sufficient to explain the fact that they are there at all, even if there are other reasons their presence might be a good thing. The situation changes once a system won’t break down without humans in positions of power: at that point, it’s not clear that those other reasons have any teeth in practice.
This doesn’t need to be the case, but only in the sense that humanity doesn’t need to build AGIs before it’s ready. It’s a new affordance, and there is a danger that it gets used irresponsibly and leads to bad outcomes. There should be some account of how, specifically, that won’t happen.
For a legally constituted corporation, the role of CEO is not only one of decision-maker, but also blame-taker: if the company goes into decline, the CEO can be fired; if the company commits a sufficiently serious crime, the CEO can be prosecuted and punished (think Jeffrey Skilling of Enron). The presence of a human whose reputation (and possibly freedom) depends on the business’s conduct conveys some trustworthiness to other humans (investors, trading partners, creditors).
If a company has an AI agent as its top-level decision-maker, then those decisions are made without this kind of responsibility for the outcome. An AI agent cannot meaningfully be fired or punished; it can be turned off, and some chatbot characters sometimes act to avoid such a fate, but I don’t think investors would be wise to count on that.
Now, what about a non-legally-constituted entity, or even a criminal one? Criminal gangs do rely on a big boss to adjudicate disputes, set strategy, and take the fall if things go sour. But online criminal groups like ransomware gangs or darknet marketplaces might be able to rest their reputation solely on performance, rather than on a human big boss who can fall or be punished. I don’t know enough about the sociology of these groups to say.
The question with intent alignment is: intent aligned with whom? If the AI executive is intent aligned with (follows orders from) the government, and the human government is voluntarily replaced with an AI government, we are left with an AI that is intent aligned with another AI.