Here’s a safety assumption that might come up: “This prior over possible laws of physics is no more confused by these sensory inputs than the Solomonoff prior.” Why would we want this? Even if we just want to maximize diamonds, we can’t identify the representation of diamonds in the Solomonoff prior. If instead we use a prior over all possible atomic physics models, we can say that the amount of diamond is the number of carbon atoms covalently bound to four other carbon atoms. But if experiments later point toward quantum physics, the prior might desperately postulate a giant atomic computer running a simulation of a quantum-physics universe. A diamond maximizer would then try to hack the simulation to rearrange the computer into diamonds. This query could tell us to keep looking for more general priors.
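The appeal of the atomic-physics prior is that “amount of diamond” becomes a simple structural predicate on the world model. A minimal sketch, assuming a toy atomic model where the world state is just a graph of atoms and covalent bonds (all names here are hypothetical, not from any real chemistry library):

```python
def count_diamond(atoms, bonds):
    """Count carbon atoms covalently bound to four other carbon atoms.

    atoms: dict mapping atom id -> element symbol, e.g. {0: "C", 1: "H"}
    bonds: iterable of frozensets {a, b} of covalently bonded atom ids
    """
    # Build an adjacency map from the bond list.
    neighbors = {a: set() for a in atoms}
    for bond in bonds:
        a, b = tuple(bond)
        neighbors[a].add(b)
        neighbors[b].add(a)
    # A "diamond" carbon: element C, exactly four neighbors, all carbon.
    return sum(
        1
        for a, elem in atoms.items()
        if elem == "C"
        and len(neighbors[a]) == 4
        and all(atoms[n] == "C" for n in neighbors[a])
    )

# Tiny example: one central carbon bonded to four other carbons.
atoms = {0: "C", 1: "C", 2: "C", 3: "C", 4: "C"}
bonds = {frozenset({0, i}) for i in range(1, 5)}
print(count_diamond(atoms, bonds))  # only atom 0 has four carbon neighbors -> 1
```

The point is not the code but the contrast: this predicate is well-defined over every hypothesis in the atomic prior, whereas no analogous predicate can be written over the raw programs of the Solomonoff prior.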