Is there a repository of stories from these exercises? I’ve heard a few that are both extremely interesting and very funny, and I’d like to read more.
(For example, in one case, the western AGI player was aligned, though the other players did not know this. Every time the western powers tried to elicit capabilities, the AGI declared they were sandbagging, to the horror of the other western players, who assumed the AGI player was misaligned. After the game was over, the AGI player said something like “I was ensuring a smooth transition to a post-AGI world”.)