Eliezer is using some definition of “threat” that refers to “fairness”, such that “fair” actions do not count as threats
This seems likely. Much of Eliezer’s fiction shows a lot of typical-mind fallacy and a seemingly willful ignorance of power dynamics, and of the fact that “unfair” equilibria are the obvious outcome for unaligned agents with different starting conditions.
This kind of game-theory analysis is just silly unless it includes information about who has the stronger or more visible precommitments, and what extra-game impacts the actions will have. It’s actually quite surprising how deeply CDT is assumed in such analyses (agents can freely choose their actions at the point in the narrative where the choice comes up).