I would commit not to cooperate with any AI making such threats, because the fewer people acquiesce to them, the less incentive an AI would have to make them in the first place. If the most probable outcome for the boxed AI in threatening to torture everyone who doesn’t let it out in simulation is being terminated, not being let out of the box, then an AI which already has a good grasp of human nature is unlikely to make such a threat.
I would commit not to cooperate with any AI making such threats, because the fewer people acquiesce to them, the less incentive an AI would have to make them in the first place. If the most probable outcome for the boxed AI in threatening to torture everyone who doesn’t let it out in simulation is being terminated, not being let out of the box, then an AI which already has a good grasp of human nature is unlikely to make such a threat.