How so? The AI lives in a universe where people are planning to fuse AIs in the way described here. Given this website, and the knowledge that one believes the red wire to be magic, there is a high probability that the red wire is fake, and only some very small probability that the wire is real. But it is also known for certain that the wire is, in fact, real. There is not even a contradiction here.
Giving a wrong prior is not the same as walking up to the AI and telling it a lie (which should never raise the probability to 1).
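To make the distinction concrete, here is a minimal Bayesian sketch with made-up numbers: hearing the assertion "the wire is real" is merely evidence, so as long as the prior is below 1 and a liar could also make the claim, the posterior stays strictly below 1. All numbers and names here are illustrative assumptions, not anything from the original discussion.

```python
# Hypothetical illustration: a verbal claim updates a belief but
# cannot push it to certainty. All probabilities are made up.

def posterior(prior, p_claim_if_true, p_claim_if_false):
    """Bayes update on hearing the claim 'the wire is real'."""
    num = p_claim_if_true * prior
    return num / (num + p_claim_if_false * (1 - prior))

# A wrong (very low) prior that the wire is real, plus a human asserting it:
p = posterior(prior=0.05, p_claim_if_true=0.9, p_claim_if_false=0.3)
print(p)  # ~0.136 -- updated upward, but still far from 1
```

A wrong prior, by contrast, is baked in before any evidence arrives; no single assertion afterwards can turn it into certainty.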
I meant "certainly" as in "I have an argument for it, so I am certain."
Claim: Describing that some part of space "contains a human", and describing its destruction, is never harder than describing a goal which ensures that every part of space which "contains a human" is treated in manner X, for a non-trivial X (where X will usually be "morally correct", whatever that means). (X is non-trivial if some known action A of the AI exists which does not treat a space volume in manner X.)
The assumption that the action A is known is reasonable for the problem of Friendly AI, since for every moral system we might wish to build into the AI, a sufficiently torturous killing can be constructed which that system labels immoral.
Proof: Describing the destruction of every agent in a certain part of space is easy: remove all mass and all energy within that part of space. We still need a way to select those parts of space which "contain a human". However, we have (via the assumption) that our goal function goes to negative infinity when evaluating a plan which treats a volume of space "containing a human" in violation of manner X. Assume for now that we have found some way !X to violate manner X for a given space volume. By pushing every space volume in existence through the goal evaluation, together with a plan to do !X to it, we detect at least those space volumes which "contain a human".
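The detection step can be sketched as a loop over candidate volumes. This is only a toy model of the argument, assuming the goal function and the !X plan are given as black boxes; the names (goal_value, plan_doing_not_X, contains_human) are mine, not part of any real system.

```python
# Toy sketch of the proof's detection step: every name here is
# illustrative. A volume "contains a human" iff the goal function
# assigns -inf to a plan that does !X to it.

NEG_INF = float("-inf")

def plan_doing_not_X(volume):
    """Stand-in for a plan that treats `volume` in violation of manner X."""
    return ("do_not_X", volume)

def goal_value(plan):
    """Stand-in goal function: -inf iff the plan mistreats a volume
    that contains a human (modeled here as a flag on the volume)."""
    _, volume = plan
    return NEG_INF if volume["contains_human"] else 0.0

def volumes_containing_humans(space_volumes):
    # Push each volume through the goal evaluation with a !X plan;
    # the ones scoring -inf are (at least) those containing a human.
    return [v for v in space_volumes
            if goal_value(plan_doing_not_X(v)) == NEG_INF]

space_volumes = [{"id": 1, "contains_human": True},
                 {"id": 2, "contains_human": False}]
print([v["id"] for v in volumes_containing_humans(space_volumes)])  # -> [1]
```

The point of the sketch is only that the hard part (recognizing humans) is done entirely by the already-given goal function; the detection loop itself adds nothing.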
This leaves us with the problem of defining !X. The assumption as it stands already provides some action A which can be used as !X.