The AI player also had AIs with different system prompts frequently coming into conflict with one another.
My expectation is that for future AIs, as today, many of an AI's goals will come from the scaffolding / system prompt rather than from the weights directly, and the "goals" from the Constitution / model spec will act more as limiters / constraints on a mostly prompt- or scaffolding-specified goal.
So in my mainline, I expect a large number (thousands / millions, as today) of goal-separate "AIs" at identical intelligence levels, rather than 1 or a handful (~20) of AIs, because the same weights still amount to different AIs with different goals.
I'm happy to see this was (somewhat?) reflected in having AIs with different system prompts. But I don't know how far that aspect was pushed -- 1-4 AIs with different system prompts still feels like a pretty steep narrowing of the number of goals I'd expect to see in the world. I don't know how much the wargame pushes on this, but a more decentralized run would be interesting to me!
At one point during the game, datacenters were destroyed with non-nuclear missile strikes, removing much of the world’s compute stock.
Yeah I think Taiwan being taken looks rather likely and rather relevant for all but the steepest, jerkiest SIE, and appears insufficiently accounted for.
The game in question was about as decentralized as you expect, I think? But, importantly, compute is very unevenly distributed. The giant army of AIs running on OpenAI's datacenters essentially all have the same system prompt (maybe there are a few variants, but they are all designed to work smoothly together towards OpenAI's goals), and that army constitutes 20% of the total AI population initially, rising at one point in the game to a bit more than 50%.
So while (in our game) there were thousands/millions of different AI factions/goals of similar capability level, the top 5 AI factions/goals by population size / compute level controlled something like 90% of the world’s compute, money, access-to-powerful-humans, etc. So to a first approximation, it’s reasonable to model the world as containing 1-4 AI factions, plus a bunch of miscellaneous minor AIs that can get up to trouble and shout warnings from the sidelines but don’t wield significant power.
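The arithmetic behind "a handful of factions to a first approximation" can be made concrete with a toy sketch. The specific shares below are hypothetical illustrations, not figures from the game, beyond the stated ~90% top-5 total:

```python
# Illustrative sketch with hypothetical numbers: even when thousands of
# distinct AI goals exist, power concentrates in a few factions if the
# top few hold nearly all the compute.

# Hypothetical per-faction compute shares for the top 5 (sums to 0.90,
# matching the ~90% figure from the game), plus a long tail of minor
# AIs splitting the remaining 10%.
top5_shares = [0.50, 0.20, 0.10, 0.06, 0.04]
minor_ai_count = 10_000
tail_share_each = (1.0 - sum(top5_shares)) / minor_ai_count

top5_total = sum(top5_shares)
print(f"Top-5 share of compute: {top5_total:.0%}")       # prints 90%
print(f"Each minor AI's share:  {tail_share_each:.4%}")  # ~0.001% each
```

Under these assumptions each minor AI controls on the order of a thousandth of a percent of world compute, which is why it can "shout warnings from the sidelines" but not wield significant power.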
If you are interested in playing a game sometime, you’d be welcome to join! I’d encourage you to make your own variant scenario too if you like.
So in my mainline, I expect a large number (thousands / millions, as today) of goal-separate "AIs" at identical intelligence levels, rather than 1 or a handful (~20) of AIs, because the same weights still amount to different AIs with different goals.
We did attempt to model this. During one phase of the game the frontier model wasn't deployed publicly, so one AI with nearly 50% of the world's compute was far ahead of the rest; during another phase the general public had access to frontier AIs. The general public's AIs didn't end up changing the outcome much, but they could have if a pause had happened. Obviously the resolution is limited, since you can't actually list millions of goals.