No, these citizens don’t share the same values in my sense, because they’re mostly selfish, and would prefer that others risk their lives to remove the dictatorship while they themselves free-ride in safety.
Yes, agreed, I was just gesturing at the closest thing we have to a real-world example. But severe coordination problems can also arise even when agents have exactly the same values—here’s one example. (EDIT: a simpler example from my other reply: “they might not be certain that the other agent has the same values as they do, and therefore all else equal would prefer to have resources themselves rather than giving them to the other agent”.) And based on this I claim that a dictatorship could in principle survive (at least for a while) even when every citizen actually shared exactly the same selfless anti-dictator values, as long as the common knowledge equilibrium were bad enough.
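To make the claim concrete, here is a minimal sketch (with made-up payoff numbers) of a threshold "revolt game" in which every citizen has identical anti-dictator values, yet universal compliance is still a stable equilibrium:

```python
# Threshold revolt game: all N citizens share the same utility function.
# A revolt succeeds only if at least K citizens join; a citizen who
# revolts when the revolt fails pays a punishment cost.
# (All numbers here are illustrative, not from any real model.)

N = 10          # citizens
K = 6           # revolters needed for the revolt to succeed
BENEFIT = 10    # shared payoff to everyone if the dictator falls
COST = 5        # punishment for revolting when the revolt fails

def payoff(my_action, num_other_revolters):
    """Utility of one citizen; identical for every citizen."""
    total = num_other_revolters + (1 if my_action == "revolt" else 0)
    u = BENEFIT if total >= K else 0
    if my_action == "revolt" and total < K:
        u -= COST
    return u

# If everyone else complies, complying is a strict best response:
assert payoff("comply", 0) > payoff("revolt", 0)      # 0 > -5

# So "all comply" is a Nash equilibrium, even though "all revolt"
# gives every citizen a strictly higher payoff:
assert payoff("revolt", N - 1) > payoff("comply", 0)  # 10 > 0
```

The point is that nothing in the agents' values pins down which equilibrium they land in; if the common knowledge is "nobody else will revolt", complying is individually optimal for everyone, forever.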
maybe we use cached “procedural values” and this explains some amount of cooperation beyond standard game theory
Yes. Except that our attitudes towards this differ: you seem to think of it as an edge case, whereas I think of it as an anomaly which is pointing us towards ways in which game theory is the wrong framework to be using. Specifically I think that game theory makes it hard to study group agents, because it assumes that each agent’s strategy is derived from its terminal values. However, in practice the way that group agents work is by programming heuristics/ethics/procedural values into their subagents. And so “cached” is an inaccurate description: it’s not that the individual derives those values rationally and then stores them for later, but rather that they’ve been “programmed” with them (e.g. via reinforcement learning).
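The contrast can be sketched in a one-shot prisoner’s dilemma (standard payoffs; the "programmed" agent is a hypothetical illustration, not a model of any particular learning process):

```python
# Contrast an agent whose strategy is derived from its terminal values
# with a subagent "programmed" with a procedural rule.
# Row player's payoffs in a standard one-shot prisoner's dilemma:
PAYOFF = {
    ("cooperate", "cooperate"): 3,
    ("cooperate", "defect"):    0,
    ("defect",    "cooperate"): 5,
    ("defect",    "defect"):    1,
}

def best_response(opponent_action):
    """Strategy derived from terminal values: maximize own payoff."""
    return max(["cooperate", "defect"],
               key=lambda a: PAYOFF[(a, opponent_action)])

def programmed_agent(opponent_action):
    """Subagent following a procedural rule it was trained/programmed
    with, rather than re-deriving its move from terminal values."""
    return "cooperate"

# Deriving strategy from terminal values yields defection either way...
assert best_response("cooperate") == "defect"
assert best_response("defect") == "defect"

# ...so two such agents get (1, 1), while a group that programs
# cooperation into its subagents gets (3, 3):
group_derived    = 2 * PAYOFF[("defect", "defect")]
group_programmed = 2 * PAYOFF[("cooperate", "cooperate")]
assert group_programmed > group_derived
```

This is only meant to illustrate the framing point: the group does better precisely because its subagents' behavior is not the best response computed from their terminal values.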
(Of course, ideally the study of group/distributed agents and the study of individual/centralized agents will converge, but I think the best way to do that is to separately explore both perspectives.)
ETA: Looks like you edited your comment quite a bit from the version I replied to.
Sorry about this!