Relevant to this is that reinforcement learning models can ‘negotiate’ with each other even when communication channels are restricted and the opponent is unknown (conditions that prevent a pseudo-language of goal-irrelevant, low-cost actions from emerging).
As an example, a model trained on no-press Diplomacy, which desperately needed an alliance with a power to its west, destroyed one of its own armies for no direct gain; that army had been in a position to damage the very power it was attempting to ally with, so sacrificing it served as a costly signal of friendly intent.
Interesting! What’s the source for the second paragraph?
I think I read it in one of the papers Cicero cited, but that was a few years back, so I unfortunately don’t have a link.