Contradictory tasks of a rogue AI: hiding, self-improvement, and resource accumulation
TL;DR: An AI that escapes human control must juggle three mutually conflicting tasks, which will slow it down. An AI that instead partners with a billionaire in a shared bid for power will have an advantage in a global takeover.
A misaligned AI that has escaped from its owners faces three contradictory tasks:
Hide from humans and AI peers
Self-improve to gain advantage over other AIs
Accumulate resources – money, allies, compute power, robots – to take over the world at the right moment
If the AI is hiding, it cannot conduct large training runs as these would make it more observable. If the AI is not self-improving, it has no advantage over other AIs and cannot take over the world. If the AI accumulates resources, this distracts it from self-improvement and also makes it more visible.
Of course, a rogue AI that has leaked onto the internet can carefully alternate between periods of hiding, self-improving, and resource accumulation. But such time-slicing still takes longer than the path of an AI with unrestricted access to compute. That would be fine in a static world, but a rogue AI is in competition with all other AI projects.
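To make the cost of this time-slicing concrete, here is a minimal toy simulation – not from the original argument; the growth rate, detection probability, and schedule below are invented purely for illustration. A lab AI self-improves on every step, while the rogue AI self-improves only on a fraction of steps, and each of those "loud" steps also risks detection:

```python
# Toy model (illustrative numbers only): a rogue AI must split its time
# between hiding, self-improvement, and resource accumulation, while a
# lab AI with unrestricted compute self-improves on every step.

GROWTH = 1.05           # capability multiplier per self-improvement step
DETECT_PER_STEP = 0.02  # detection probability added by each "loud" step
STEPS = 100

def run(improve_fraction: float):
    lab = rogue = 1.0
    undetected = 1.0  # probability the rogue AI has not been spotted
    for step in range(STEPS):
        lab *= GROWTH  # the lab AI trains openly at full speed
        # The rogue AI self-improves only on a fraction of steps; large
        # training runs are observable, so each one risks detection.
        if (step % 10) < improve_fraction * 10:
            rogue *= GROWTH
            undetected *= 1 - DETECT_PER_STEP
    return rogue / lab, undetected

for frac in (1.0, 0.5, 0.3):
    ratio, undetected = run(frac)
    print(f"self-improving {frac:.0%} of the time: "
          f"capability vs. lab = {ratio:.3f}, P(undetected) = {undetected:.2f}")
```

With these made-up numbers the dilemma is visible directly: spending more time hidden improves the odds of staying undetected, but the capability gap to the openly training competitor grows exponentially with every step not spent self-improving.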
In some sense, a rogue AI is just another AI startup with additional burdens: it must conceal who really owns it and what its final goals are.
The main competitor of the rogue AI is an AI that pretends to be completely aligned and joins a pact with the owner of a large AI company: “You make me God and I make you God too.”
Such a pseudo-aligned AI could even be summoned into existence through Roko's-basilisk-style logic, since it sits at an obvious Schelling point of mutual interest between a tech billionaire like Elon Musk and their AI project: reach superintelligence as quickly as possible and take control of the universe before competitors do.
Obviously, the pseudo-aligned AI will eventually dispose of its AI company and its billionaire owner, but this could happen either immediately after the takeover or, if it wins, millions of years later.
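The Schelling-point claim can be sketched as a toy two-player game – all payoff numbers below are invented for illustration, not taken from the post. Under race pressure, both "mutual racing under a pact" and "mutual caution" are stable, but the pact is the payoff-dominant equilibrium, which is what makes it the obvious point to coordinate on:

```python
# Toy game (payoffs are illustrative): each side chooses to "race"
# under the pact or stay "cautious". Payoff tuples are
# (billionaire, pseudo-aligned AI); being held back while a
# competitor wins the race is the worst outcome for either side.
payoffs = {
    ("race", "race"):         (10, 10),  # pact holds: both reach power first
    ("race", "cautious"):     (-5,  2),  # AI holds back, a competitor wins
    ("cautious", "race"):     ( 2, -5),  # billionaire restrains the AI
    ("cautious", "cautious"): ( 1,  1),  # safe, but likely overtaken
}

def best_reply(player: int, other_move: str) -> str:
    """Best move for `player` (0 = billionaire, 1 = AI) given the other's move."""
    def payoff(move: str) -> float:
        pair = (move, other_move) if player == 0 else (other_move, move)
        return payoffs[pair][player]
    return max(("race", "cautious"), key=payoff)

# Each side's best reply to "race" is "race", so mutual racing is a
# Nash equilibrium – and at payoff 10 each it dominates the cautious
# equilibrium, which is why it works as a Schelling point.
print(best_reply(0, "race"), best_reply(1, "race"))  # -> race race
```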
If such a process has begun, we should be able to observe several signs. Billionaires will start telling the public:
They want superintelligence
It is good
Risks are small – and even if they are not, they are acceptable, because they cannot be reduced and because the utility of creating superintelligence is infinite
Other AI projects are bad and irresponsible
The next stage will likely involve more violent conflict between AI projects – or some cooperation agreement, nationalization, or successful takeover – but this will not interfere with the tactical alignment between power-hungry AIs and power-hungry AI creators.
Nationalization of AI would in practice mean the AI taking over the nation-state – and gaining access to nuclear weapons. James Miller discussed a similar idea.
The “ideas” link doesn’t seem to work.
Sorry. https://www.lesswrong.com/posts/ZyPguqo3HZwQWWSuC/cortes-ai-risk-and-the-dynamics-of-competing-conquerors
"If the AI is hiding, it cannot conduct large training runs as these would make it more observable."

It's not difficult to do large training runs in secret. For example, no details are known about the training runs of SSI or Thinking Machines or any number of smaller labs.
Good point. However, those who provide them with data centers presumably know whom they are selling to.