Relevant to this is that reinforcement learning models can ‘negotiate’ with each other even when communication channels are restricted and the opponent is unknown (conditions that prevent a pseudo-language of goal-irrelevant, low-cost actions from emerging).
As an example, a model trained on no-press Diplomacy, which desperately needed an alliance with a power to its west, destroyed one of its own armies for no direct gain; that army had been in a position to damage the very power it was attempting to ally with, so sacrificing it served as a costly signal of friendly intent.
Interesting! What’s the source for the second paragraph?
I think I read it in one of the papers Cicero cited, but that was a few years back, so I unfortunately don’t have a link.