johnswentworth comments on The Commitment Races problem

johnswentworth 23 Aug 2019 2:08 UTC
LW: 8 AF: 5
−2
One big factor this whole piece ignores is communication channels: a commitment is completely useless unless you can credibly communicate it to your opponent/partner. In particular, this means that there isn’t a reason to self-modify to something UDT-ish unless you expect other agents to observe that self-modification. On the other hand, other agents can simply commit to not observing whether you’ve committed in the first place—effectively destroying the communication channel from their end.
In a game of chicken, for instance, I can counter the remove-the-steering-wheel strategy by wearing a blindfold. If both of us wear a blindfold, then neither of us has any reason to remove the steering wheel. In principle, I could build an even stronger strategy by wearing a blindfold and using a beeping laser scanner to tell whether my opponent has swerved—if both players do this, then we’re back to the original game of chicken, but without any reason for either player to remove their steering wheel.
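A minimal sketch of the point above, with made-up payoff numbers (not from the original discussion): the hypothetical helper `committer_payoff` assumes I have already removed my steering wheel and asks how the opponent replies. A sighted opponent best-responds to what it sees; a blindfolded one best-responds only to its prior, here the assumed parameter `prior_straight`, that I will go straight.

```python
SWERVE, STRAIGHT = "swerve", "straight"

# Illustrative payoffs (mine, opponent's); the numbers are assumptions.
PAYOFF = {
    (SWERVE, SWERVE): (0, 0),
    (SWERVE, STRAIGHT): (-1, 1),
    (STRAIGHT, SWERVE): (1, -1),
    (STRAIGHT, STRAIGHT): (-10, -10),  # crash
}


def opponent_reply(my_action, opponent_sees_me, prior_straight):
    """Opponent's best action, given what it can observe about me."""
    if opponent_sees_me:
        # The commitment is communicated: best-respond to my actual action.
        def value(a):
            return PAYOFF[(my_action, a)][1]
    else:
        # Blindfolded: best-respond to the prior only, ignoring my_action.
        def value(a):
            return (prior_straight * PAYOFF[(STRAIGHT, a)][1]
                    + (1 - prior_straight) * PAYOFF[(SWERVE, a)][1])
    return max((SWERVE, STRAIGHT), key=value)


def committer_payoff(opponent_sees_me, prior_straight):
    """My payoff after removing the steering wheel (forced to go straight)."""
    reply = opponent_reply(STRAIGHT, opponent_sees_me, prior_straight)
    return PAYOFF[(STRAIGHT, reply)][0]


if __name__ == "__main__":
    print(committer_payoff(True, 0.05))   # opponent sees the missing wheel
    print(committer_payoff(False, 0.05))  # opponent is blindfolded
```

Under these assumed numbers, the commitment wins against a sighted opponent and ends in a crash against a blindfolded opponent who started out thinking I would probably swerve. The outcome in the blindfolded case depends only on the opponent's prior, not on whether I actually removed the wheel.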
- Daniel Kokotajlo 23 Aug 2019 2:45 UTC
  LW: 5 AF: 3
  2
  I think in the acausal context at least that wrinkle is smoothed out.
  In a causal context, the situation is indeed messy, as you say, but I still think commitment races might happen. For example, why is [blindfold+laserscanner] a better strategy than just blindfold? It loses to the blindfold strategy, for one thing. Whether or not it is better than blindfold depends on what you think the other agent will do, so it’s entirely possible that we could get a disastrous crash: just imagine that, for whatever reason, both agents think the other agent will probably not do pure blindfold. This can totally happen, especially if the agents don’t think they are strongly correlated with each other, and sometimes even if they do (e.g. if they use CDT). The game of chicken doesn’t cease being a commitment race when we add the ability to blindfold and the ability to visibly attach laserscanners.
  - johnswentworth 23 Aug 2019 5:47 UTC
    LW: 7 AF: 3
    0
    Blindfold + scanner does not necessarily lose to blindfold. The blindfold does not prevent swerving; it just prevents gaining information, so the blindfold-only agent acts solely on its priors. Adding a scanner gives the agent more data to work with, potentially allowing the agent to avoid crashes. Forgoing the scanner doesn’t actually help unless the other player knows I’ve forgone the scanner, which brings us back to communication, though the “communication” at this point may be in logical time, via simulation.
    In the acausal context, communication kicks in even harder, because either player can unilaterally destroy the communication channel: they can simply choose not to simulate the other player. The game will never happen at all unless both agents expect (based on priors) to gain from the trade.
    - Daniel Kokotajlo 26 Aug 2019 4:55 UTC
      LW: 8 AF: 2
      2
      If you choose not to simulate the other player, then you can’t see them, but they can still see you. So it only destroys one direction of the communication channel. But the direction that remains (them seeing you) is the one most relevant to, e.g., whether or not there is a difference between making a commitment and credibly communicating it to your partner. Not simulating the other player is like putting on a blindfold, which might be a good strategy in some contexts but seems kinda like making a commitment: you are committing to act on your priors in the hopes that they’ll see you make this commitment and then conform their behavior to the incentives implied by your acting on your priors.