I was thinking about this:
Perhaps this link is relevant: https://www.fanaticalfuturist.com/2024/12/ai-agents-created-a-minecraft-civilisation-complete-with-culture-religion-and-tax/ (it’s not a research paper, but I think yours isn’t either?)
Voyager is a single agent, but it’s very visual: https://voyager.minedojo.org/
OpenAI already did the hide-and-seek project a while ago: https://openai.com/index/emergent-tool-use/
While those are not examples of computer use, I think they fit the bill for presenting multi-agent capabilities in a visual way.
I’m happy to see that you are creating recaps for journalists and social media.
Regarding the comment on advocacy, “I think it also has some important epistemic challenges”: I’m not going to deny that in a highly optimized slide deck, you won’t have time to balance each argument. But does it matter that much? Rationality is winning, and to win, we need to be persuasive in a limited amount of time. I don’t have the time to also fix civilizational inadequacy regarding epistemics, so I play the game, as the other side does.
Also, I’m not criticizing the work itself, but rather its justification or goal. I think that if you did some goal factoring, you could optimize for that goal more directly.
Let’s chat in person!
Looking forward to chatting!
I think examples of agents pursuing goals in the real world are more interesting than Minecraft or other game environments: they’re more similar to white-collar work, and I think they’re more relevant for takeover. As a side note, when I looked into it a few months ago, reporting about Altera’s agents seemed to generally overclaim massively (the agents take actions at a very high level through a scaffold, and in video footage they seemed very incapable).