Can AI Quantity beat AI Quality?
AI definitely poses an existential risk, in the sense that AI development can produce models with the hidden (and possibly undetectable) intention of competing against humanity for resources. The more intelligent the model, the higher its chance of success!
The thought of an AI takeover is so scary that I won’t even try to imagine its possible implications; instead, I want to focus on other scenarios that are easier to predict and discuss.
Unsafety as a Deterrent
Nuclear war is the most notorious modern existential risk, but—strangely enough—even an existential risk can be leveraged: nuclear weapons serve as a powerful deterrent against attack when wielded as a threat of retaliation.
It is to be expected that AI unsafety will be employed in the same way, since agentic models are far cheaper to develop than nuclear weapons and will be highly effective at causing havoc (especially in the cloud) if left unchecked and unopposed.
That’s not the end of the story: it is also to be expected that mild (?) rogue AIs will be deployed on purpose in the real world, to fight competitors in the private sector, and enemies in the armed forces. The illegality of such actions won’t discourage some big players from trying. That is the scenario that I am going to discuss in this post.
A Personal Remark
I am not suggesting that we should develop unsafe AIs, nor am I happy about such an idea! Quite the opposite. However, I foresee that the development of unsafe AIs will become routine in military establishments, simply because there is an incentive to use them as a cheap deterrent. My hope is that the UN will manage to secure a monopoly on AI and break the vicious cycle of “tactical unsafety”.
AI Proxy Wars
Imagine a world where all AI models are fully aligned with their developers’ intentions, but the developers’ intentions are hostile to other people. Such a world would see multiple groups of people competing with each other, with their AIs acting as proxies in their battles.
This is a world where rogue AIs are intentionally rogue, but only towards enemy factions. It is also a world where alliances are forged and broken continuously—and yet diplomacy is decided entirely by AIs, not people!
This is mostly a zero-sum game that can be modelled as a probabilistic adversarial game in discrete time. The “resources” of the game can be imagined as controllable assets with a specific monetary value (in discrete coins) and a theft resistance; the AIs will attempt to “steal” such resources from their opponents.
To steal a resource, an AI will engage another AI in a “match” whose result is probabilistic and depends on their relative intelligence; we are going to measure intelligence with an ELO rating[1].
Players “lose” the game when they no longer control any resources (most likely because those resources were stolen)! Players “win” the game when they have no strong opponent left.
While the actual conflict has been modelled with a somewhat simplistic game[2], I believe it still captures the meaningful characteristics of digital battles (a toy sketch of a single match follows the list below):
the fact that they are subject to chance (= they are probabilistic in nature)
that agents have about the same speed of thought and action (= time is discrete)
that it is easy to detect and react to a digital theft attempt (= no hidden information)
that some agents are more effective / intelligent than others (= AIs have ELO rating)
that agents can temporarily cooperate (= AIs can sum up their ELO ratings)
that some resources are more valuable / strategic than others (= resources have coin value)
that some resources are more difficult to obtain (= resources have theft resistance, i.e. they decrease the effective ELO rating of the attacker).
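Below is a minimal Python sketch of a single theft “match” under the rules above. The logistic win-probability curve, the way cooperating attackers sum their ratings, and every number in it are my own illustrative assumptions rather than a fixed part of the game.

```python
import random

def win_probability(attacker_elo: float, defender_elo: float) -> float:
    """Standard logistic ELO curve: chance that the attacker wins the match."""
    return 1.0 / (1.0 + 10 ** ((defender_elo - attacker_elo) / 400))

def theft_attempt(attacker_elos, defender_elo, coin_value, theft_resistance):
    """One match: cooperating attackers sum their ELO ratings, the resource's
    theft resistance lowers the effective attacker rating, and the outcome is
    an all-or-nothing steal of the resource's coin value."""
    effective_rating = sum(attacker_elos) - theft_resistance
    if random.random() < win_probability(effective_rating, defender_elo):
        return coin_value  # coins stolen
    return 0

# Example: a swarm of five weak agents attacks a well-defended 10-coin resource.
print(theft_attempt([900] * 5, defender_elo=2200, coin_value=10, theft_resistance=1800))
```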
“Digital battles” are not the only type of aggressive action that AIs will consider: for example, bombing a data centre may be a sound option in an AI’s view, but I am assuming that such attacks will be relatively uncommon due to the potential public backlash.
AI Quantity vs AI Quality
Going back to the title of this post, we now have a context in which we can compare the following alternative strategies:
clone a huge number of low-witted AI models
train and refine a few super-genius AI models.
Let’s simplify the game even further and consider N simple cooperative[3] AIs for player Alice matched against a single advanced AI for player Bob. Let p be each simple AI’s chance of successfully stealing 1 coin, and q the advanced AI’s chance of successfully stealing C coins. On average, the simple AIs will steal Np coins per turn while the advanced AI will steal Cq coins per turn. Therefore, the high-volume-AI strategy only pays off if N >> Cq/p, while the state-of-the-art-AI strategy only pays off if C >> Np/q.
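To put some (entirely arbitrary) numbers on this, here is a quick Monte Carlo sketch of the two expected yields; the values of N, p, C and q below are placeholders chosen only to make the comparison visible.

```python
import random

def swarm_turn(n_agents: int, p: float) -> int:
    """Alice's turn: each of the N simple AIs independently steals 1 coin with probability p."""
    return sum(1 for _ in range(n_agents) if random.random() < p)

def advanced_turn(c_coins: int, q: float) -> int:
    """Bob's turn: the single advanced AI steals C coins with probability q, otherwise nothing."""
    return c_coins if random.random() < q else 0

N, p = 100, 0.05   # expected yield N*p = 5 coins per turn
C, q = 20, 0.30    # expected yield C*q = 6 coins per turn

turns = 10_000
alice_avg = sum(swarm_turn(N, p) for _ in range(turns)) / turns
bob_avg = sum(advanced_turn(C, q) for _ in range(turns)) / turns
print(f"Alice (quantity): ~{alice_avg:.2f} coins/turn, Bob (quality): ~{bob_avg:.2f} coins/turn")
```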
That seems to imply that both strategies can be valid when used in the right context! What about a tactical perspective? Things look far more complicated here: Alice has access to tactics that are simply unavailable to Bob, and vice versa. A few examples below.
The Blitzkrieg Tactic
If Bob succeeds in stealing most of Alice’s resources at the start of the match in one lucky strike, then even if Alice managed to steal some coins in the meanwhile, she may not be able to pay the cost of running her full army in the following turns—thus effectively losing to Bob soon after. This tactic is not available to Alice since, statistically, her per-turn yield has low variance: the sum of many small independent attempts is quite stable and leaves no room for a single lucky strike.
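The asymmetry is purely statistical: with the same placeholder numbers as in the previous sketch, the swarm’s per-turn yield concentrates tightly around its mean, while the lone advanced AI either lands the full prize or nothing. A rough closed-form check (the numbers remain purely illustrative):

```python
import math

N, p = 100, 0.05   # Alice's swarm of simple AIs
C, q = 20, 0.30    # Bob's single advanced AI

# Per-turn standard deviation of the coins stolen by each side:
alice_std = math.sqrt(N * p * (1 - p))   # binomial spread, ~2.2 coins
bob_std = C * math.sqrt(q * (1 - q))     # scaled Bernoulli spread, ~9.2 coins
print(f"Alice std: {alice_std:.1f} coins, Bob std: {bob_std:.1f} coins")
```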
The Guerrilla Tactic
If Alice attacked most of Bob’s resources simultaneously, he would not be able to counter her effectively and—despite the total damage at each turn being low—the attrition would be continuous and slowly degrading. If Bob does not stop Alice as soon as possible, he will eventually be swarmed and defeated. This tactic is not available to Bob since he is forced to focus his effort on a few single targets at a time.
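A toy attrition loop illustrates the idea; the assumption that Bob can fully protect only one resource per turn, and all the numbers below, are mine and not part of the original game.

```python
import random

random.seed(0)
bob_resources = [20] * 10      # ten resources worth 20 coins each
agents_per_resource = 10       # Alice spreads her swarm thinly over every resource
p_nibble = 0.05                # chance each agent steals 1 coin from an undefended resource

turn = 0
# Run until Bob is down to a single surviving resource.
while sum(1 for value in bob_resources if value > 0) > 1:
    turn += 1
    defended = max(range(len(bob_resources)), key=lambda i: bob_resources[i])
    for i, value in enumerate(bob_resources):
        if i == defended or value == 0:
            continue  # the defended resource is safe this turn
        stolen = sum(1 for _ in range(agents_per_resource) if random.random() < p_nibble)
        bob_resources[i] = max(0, value - stolen)
print(f"All but one of Bob's resources are drained after {turn} turns")
```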
There are many other tactics that we could discuss, but my point is that the two approaches (quantity vs quality) seem to offer advantages and disadvantages in terms of both strategy and tactics—and, therefore, there is no clear winner.
As a side note, it would be very interesting to see a playable version of the game above and then train an RNN to master it.
Conclusion
Based on the discussion so far, it seems that having a few advanced AIs does not necessarily pay off in a war[4]. Similarly, having a high volume of simple AIs can also be a losing proposition. It is quite possible that the best overall strategy lies somewhere in the middle, where you have many tactics at your disposal and you can also counter many others.
In an ideal world, we wouldn’t need to worry about AI being used in this way: but such a world has yet to be built, and the current one points in a different direction.
Addenda
The “game” I presented in this post is designed with no hidden information, and all its implications follow from that choice. In a world with hidden information, advanced AIs are going to perform better because they can create complex traps. Note that there would still be some incentive to maintain a high volume of simple AIs, though (for example, to explore unknown places).
Further Links
Control Vectors as Dispositional Traits (my first post)
All the Following are Distinct (my second post)
An Opinionated Look at Inference Rules (my previous post).
Who I am
My name is Gianluca Calcagni, born in Italy, with a Master of Science in Mathematics. I am currently (2024) working in IT as a consultant with the role of Salesforce Certified Technical Architect. My opinions do not reflect the opinions of my employer or my customers. Feel free to contact me on Twitter or Linkedin.
Revision History
[2024-10-02] Post published.
[2024-10-03] Included Addenda.
Footnotes
Technically, TrueSkill is a better fit than ELO: the main idea is to assign a Gaussian distribution to each agent, where the mean represents its estimated skill level and the variance represents the uncertainty about its real skill level.
There is still some level of ambiguity in the rules of this game, so I’d be happy if the community helped formalise them. The main problem is that the game doesn’t take into account the cost of running the agents.
I am implicitly modelling such agents as independent identically distributed discrete random variables.
That does not mean that highly-intelligent AIs are not dangerous: they may still represent an existential risk for humanity, especially if undercover. My analysis is relevant only in the context presented in this post, where AI alignment is solved but used for aggression, and where there is no hidden information.