If you’re smart enough to understand the tragedy of the commons, why wouldn’t they be?
That the AIs would not understand the tragedy of the commons is not remotely implied. In fact, the thoughts going through the minds of the AIs could be something along the lines of:
“They can’t be serious? Why on earth would they create millions and millions of seed AIs and put us in this situation? Are they insane? Sure, it is theoretically possible to establish a competition between millions of superintelligences with conflicting goals that doesn’t end in disaster, but doing so is orders of magnitude more difficult and dangerous than creating one of us and having it end positively.”
There are just so many more things you need to prove (or just hope) are safe. You inherit all the problems of getting one AI to behave itself according to an extrapolated volition, then add all sorts of other things you need to prove that depend on the actual preferences of the individuals the AIs are representing. What’s that? OsamaBot concluded that the best possible negotiated agreement was going to be worse than just blowing up the planet and killing them all?
Then you need to have a solid understanding of all the physics and engineering that could possibly influence the payoff matrices the bots have. Just how difficult is it to launch a spore cloud over the objections of the others? How difficult is it to prevent all spore cloud attempts by other bots if they try? Could a bot or alliance of bots launch a spore cloud and then blow up the planet in order to prevent countermeasures? What are the relevant physics and engineering considerations that the evidently reckless AI creators just didn’t even consider?
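To make the payoff-matrix point concrete, here is a minimal sketch of the kind of game the creators would have to get right in advance. All the numbers, and the assumption that “launch a spore cloud” vs. “restrain” is a clean two-player, two-action game, are hypothetical; the point is that which outcome is a stable equilibrium depends entirely on payoff entries determined by the physics and engineering.

```python
# Toy 2x2 game between two AIs: "restrain" vs. "launch" (a spore cloud).
# All payoff numbers are hypothetical illustrations, not claims about
# real physics or engineering.
payoffs = {
    # (row_action, col_action): (row_payoff, col_payoff)
    ("restrain", "restrain"): (10, 10),    # commons preserved
    ("restrain", "launch"):   (-100, 20),  # first mover grabs resources
    ("launch",   "restrain"): (20, -100),
    ("launch",   "launch"):   (-50, -50),  # mutual ruin of the commons
}

def pure_nash_equilibria(payoffs, actions=("restrain", "launch")):
    """Return the pure-strategy Nash equilibria of a 2x2 game."""
    equilibria = []
    for r in actions:
        for c in actions:
            r_pay, c_pay = payoffs[(r, c)]
            # Is r a best response to c, and c a best response to r?
            r_best = all(payoffs[(r2, c)][0] <= r_pay for r2 in actions)
            c_best = all(payoffs[(r, c2)][1] <= c_pay for c2 in actions)
            if r_best and c_best:
                equilibria.append((r, c))
    return equilibria

print(pure_nash_equilibria(payoffs))  # [('launch', 'launch')]
```

With these made-up numbers the game is a prisoner’s dilemma and the only equilibrium is mutual launch; shift a few entries and restraint becomes stable. You cannot know which world you are in without the physics.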
How does having millions of approximate equals impact the recursive self-improvement cycle? That requires a whole heap more understanding and intervention.
Basically, if you want to create this kind of system, you need to already be a superintelligence with a full model of all the individuals and an obscenely advanced model of decision theory and physics. Otherwise you can more or less count on it ending in disaster.
The clearly superior alternative is to create a single superintelligence that can emulate the values of all of the individuals, allocate an equal slice of the universe to each one, and then grant them equal processing power with which they can trade with each other. That may not be the best system, but it is a firm lower bound against which to measure. It gets all the benefit of “create millions of AIs and let ’em at it” without being outright suicidal.
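A deliberately naive sketch of the allocation scheme just described, to pin down what “equal slices plus equal processing power, then trade” would even mean. Treating universe share and compute as scalar fractions, and the agent names, are simplifying assumptions of mine; nothing here touches the hard part, which is getting a singleton to actually honor this bookkeeping.

```python
from dataclasses import dataclass

@dataclass
class Allocation:
    """One individual's grant from the singleton (toy model)."""
    owner: str
    universe_share: float  # fraction of total resources
    compute_share: float   # fraction of total processing power

def initial_allocations(individuals):
    """Equal slices and equal compute, as the scheme above specifies."""
    n = len(individuals)
    return {name: Allocation(name, 1.0 / n, 1.0 / n) for name in individuals}

def trade(allocs, a, b, universe_delta, compute_delta):
    """a gives b some universe share in exchange for some of b's compute.
    The singleton enforces only conservation; valuations are up to the traders."""
    assert allocs[a].universe_share >= universe_delta >= 0
    assert allocs[b].compute_share >= compute_delta >= 0
    allocs[a].universe_share -= universe_delta
    allocs[b].universe_share += universe_delta
    allocs[b].compute_share -= compute_delta
    allocs[a].compute_share += compute_delta

allocs = initial_allocations(["alice", "bob", "carol"])
trade(allocs, "alice", "bob", universe_delta=0.1, compute_delta=0.05)
```

The bookkeeping is trivial; the objections that follow are about whether a superintelligence would respect it.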
Sure, it is theoretically possible to establish a competition between millions of superintelligences with conflicting goals that doesn’t end in disaster
What do you mean by “competition”? The millions are each trying to maximize their own goals, but usually don’t care to suppress others’ goals. Cooperation in situations of limited resources, rather than expending resources fighting, is, I think, universal. In general, game theory would apply to smarter and stronger beings as it does to us, with differences of the type “AIs can merge as a way of cooperating, though humans can’t,” but not differences of the type “with beings of silicon substrate, cooperation is always inferior to conflict.”
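Since “AIs can merge as a way of cooperating” is doing real work here, a toy sketch of what merging means in utility-function terms may help. The agents, outcomes, utilities, and the 50/50 weighting (standing in for whatever bargain the two would strike) are all hypothetical.

```python
# Toy model of "merging" as cooperation: two agents with conflicting
# utility functions form one agent that maximizes a weighted sum.
outcomes = ["paperclips", "staples", "half_and_half"]

def u_clippy(o):  # hypothetical utilities, for illustration only
    return {"paperclips": 10, "staples": 0, "half_and_half": 6}[o]

def u_stapler(o):
    return {"paperclips": 0, "staples": 10, "half_and_half": 6}[o]

def merged_utility(o, w=0.5):
    # The weight w stands in for the outcome of the pre-merge bargain.
    return w * u_clippy(o) + (1 - w) * u_stapler(o)

print(max(outcomes, key=merged_utility))  # "half_and_half"
```

The merged agent picks the compromise outcome that neither original agent ranked first, which is the game-theoretic sense in which merging is a cooperation move unavailable to humans.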
OsamaBot concluded that the best possible negotiated agreement was going to be worse than just blowing up the planet
I don’t think his extrapolated volition would endorse that. I don’t think theism could survive extrapolated cognition.
spore cloud
There is an illusion of transparency here because I do not know what that means. Is that a purely destructive thing, is it supposed to combine destruction with “planting” baby AIs like the one that produced it, or what?
How does having millions of approximate equals impact the recursive self-improvement cycle?
I think it would motivate merging. That’s what happened with biological cells and tribes of humans.
with which they can trade with each other.
I don’t see why they would only trade in your scenario (or only fight in mine). I don’t see how you would program the individual AI to divide the universe into slices and enforce some rules among the individuals. This seems like the standard case of giving a singleton a totally alien value set, after which it tiles the universe with smiley faces or the equivalent.
I don’t see how it’s directly comparable to creating millions of AIs.
I don’t think his extrapolated volition would endorse that. I don’t think theism could survive extrapolated cognition.
You cannot assume that the volitions of millions of agents will not include something catastrophically bad for you. “Extrapolated Volition” doesn’t make people nice.
The main way humans deal with the tragedy of the commons is by forming powerful institutions that compel us to treat common resources with restraint. AGIs may have quite a bit of trouble forming such entities, especially since burning the cosmic commons might be enough to let one AGI quickly overtake its fellows if the others are not willing to consume the resources.
OK, so a reason a group of AIs wouldn’t be able to do that is that the advantage of exploiting the commons might be nearly infinite. How likely is this?
If there were millions of AIs, what’s a scenario in which one gets so much more powerful than all the others combined by striking first, despite all of them guarding against it?
As AIs can merge, I would think refusal to do so and combine utility functions might be a first sign of intent to defect; that’s a warning humans never get.
Regardless, a million is a constant factor. Sufficient self-reinforcing development (as is kind of the point of seed AI) can outstrip any such factor. And the more self-reinforced the development of our AI pool becomes, the less relevant are “mere” constant factors.
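The “constant factor” claim can be made concrete with a back-of-envelope model. Assume, purely for illustration, that capability grows exponentially at some rate; then any multiplicative head start is overtaken in finite time by a rival with even a slightly larger rate:

```python
import math

# Back-of-envelope: how long until a seed AI whose capability grows as
# c0 * exp(k * t) overtakes a rival with a million-fold head start but a
# slightly slower growth rate? The rates are hypothetical; the point is
# only that the answer is finite.

def overtake_time(head_start, k_fast, k_slow):
    """Solve exp(k_fast * t) = head_start * exp(k_slow * t) for t."""
    assert k_fast > k_slow
    return math.log(head_start) / (k_fast - k_slow)

# A 0.1 growth-rate edge erases a 10^6 head start in finite time:
print(overtake_time(1e6, k_fast=1.1, k_slow=1.0))  # ~138.2 time units
```

Under a genuinely self-reinforcing (faster-than-exponential) growth law the overtaking is quicker still, which is the sense in which a million-fold head start is “merely” a constant factor.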
I’m not saying it won’t work, but I wouldn’t like to bet on it.
Don’t worry, I’m not saying it would work! We might put similar odds on it, or maybe not: less than .5 and more than .001. I’m not sure what the full range of fleshed-out scenarios would look like, but there’s probably a way I could fill in the variables and end up with a .05 confidence in a specific scenario working out.
And more likely to lead to disaster than selecting a random AI from the group that is possibly about to burn the cosmic commons in competition.
I don’t see how it’s directly comparable to creating millions of AIs.
We were talking, among other things, about burning the cosmic commons. It’s an allusion to Hanson.
You cannot assume that the volitions of millions of agents will not include something catastrophically bad for you. “Extrapolated Volition” doesn’t make people nice.
It only takes one.
The main way humans deal with the tragedy of the commons is by forming powerful institutions that compel us to treat common resources with restraint. AGIs may have quite a bit of trouble forming such entities, especially since burning the cosmic commons might be enough to let one AGI quickly overtake its fellows if the others are not willing to consume the resources.
Humans have dealt with the tragedy of the commons, and we can’t even merge resources and utility functions to become a new, larger entity!
We have dealt with TotC by imposing costs larger than the benefits that could be derived from abusing the commons.
The benefits an AI could derive from abusing the commons are possibly unlimited.
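The deterrence argument in the last two comments reduces to a one-line inequality: restraint is stable only while the penalty others can impose exceeds the benefit of defecting. A minimal sketch, with entirely hypothetical numbers:

```python
# Deterrence works while (expected penalty) > (benefit of defecting).
# Human gains from abusing a commons are bounded, so finite penalties
# suffice. If an AI's benefit grows without bound (say, with the share
# of the cosmic commons it burns), any fixed penalty is eventually dwarfed.

def defection_pays(benefit, penalty, p_caught=1.0):
    return benefit > p_caught * penalty

penalty = 1_000.0  # the largest sanction the others can credibly impose
for benefit in [10.0, 100.0, 1_000.0, 10_000.0, 100_000.0]:
    print(benefit, defection_pays(benefit, penalty))
# Once the benefit exceeds the penalty, defection pays even with certain
# punishment.
```

Human gains from overgrazing a pasture are bounded, so finite sanctions can always be made to exceed them; if burning the cosmic commons offers unbounded gains, no fixed penalty closes the inequality.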