It seems that one of the goals of religion is to put humans in a state of epistemic uncertainty about the payoff structure of their current game. Relatedly, your setup seems to imply that the AI is in a state of very high epistemic certainty.
I’m not sure how much epistemic uncertainty is needed, but you are correct that there is epistemic uncertainty for all parties. Given a probabilistic action filter, it is uncertain whether any particular action will entail the destruction of humanity, and this is common knowledge. I am not the first or only one to propose epistemic uncertainty on the part of the AI with respect to goals and actions. See Stuart Russell: https://arxiv.org/abs/2106.10394