After years of tinkering and incremental progress, AIs can now play Diplomacy as well as human experts.[6]
CICERO was a custom-trained Diplomacy model that couldn’t win against human experts if they knew it was an AI. Now, in 2025, we have https://every.to/diplomacy, which is just off-the-shelf LLM chatbots applied to Diplomacy. I’m curious how they would stack up against human experts who knew they were AIs. I expect they’d probably lose, but that if they could somehow do lots of RL on games against humans, they’d start winning, just as I originally forecast.