Dissolving the paradox of useful noise
[I may try to flesh this out into a full-fledged post, but for now the idea is only partially baked. If you see a hole in the argument, please poke at it! Also I wouldn’t be very surprised if someone has made this point already, but I don’t remember seeing such.]
A perfect Bayesian doesn’t need randomization.
Yet in practice, randomization seems to be quite useful.
How to resolve this seeming contradiction?
I think the key is that a perfect Bayesian (Omega) is logically omniscient. Omega can always fully update on all of the information at hand. There’s simply nothing to be gained by adding noise.
A bounded agent will have difficulty keeping up. As with Omega, human strategies are born from an optimization process. This works well to the extent that the optimization process is well-suited to the task at hand. To Omega, it will be obvious whether the optimization process is actually optimizing for the right thing. But to us humans, it is not so obvious. Think of how many plans fail after contact with reality! A failure of this kind may look like a carefully executed model with some obvious-in-retrospect confounders that were not accounted for. For a bounded agent, there appears to be an inherent difference between seeing the flaw once it is pointed out and being able to notice the flaw in the first place.
If we are modeling our problem well, then we can beat randomness. That’s why we have modeling abilities in the first place. But if we are simply wrong in a fundamental way that hasn’t occurred to us, we will be worse than random. It is in such situations that randomization is, in fact, helpful.
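A toy illustration of this (my example, not part of the original argument): in matching pennies, a deterministic strategy that looked clever against the opponents we imagined loses every round to an opponent who has noticed its pattern, while a coinflip can never be pushed below 50%. A minimal sketch, with made-up strategies:

```python
import random

MOVES = ("H", "T")

def modeled_strategy(history):
    # Our "carefully executed model": alternate H, T, H, T, ...
    # It beat the opponents we imagined; the unmodeled confounder
    # is an opponent who has noticed the pattern.
    return MOVES[len(history) % 2]

def random_strategy(history):
    # No model at all: a fair coin.
    return random.choice(MOVES)

def win_rate(strategy, rounds=10_000):
    # Matching pennies: we win a round by mismatching the opponent.
    # This opponent predicts our modeled strategy perfectly.
    history, wins = [], 0
    for _ in range(rounds):
        ours = strategy(history)
        theirs = modeled_strategy(history)  # the opponent's perfect read of our model
        wins += ours != theirs
        history.append(ours)
    return wins / rounds

print(win_rate(modeled_strategy))  # 0.0  -- worse than random
print(win_rate(random_strategy))   # ~0.5 -- the floor randomization guarantees
```

Randomness here is exactly a hedge against our model being wrong in a way that hasn’t occurred to us.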
This is why the P vs. BPP difference matters. From the logically omniscient perspective, P and BPP can solve the same problems equally well (indeed, it is widely conjectured that P = BPP). But to a bounded agent the difference does matter: to the extent that the best known randomized algorithm for a problem is more efficient than the best known deterministic one, the bounded agent can win by using randomization. This is fully compatible with P and BPP being equally powerful from Omega’s perspective.
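One concrete instance of that gap (my example, not the post’s): Freivalds’ algorithm checks a claimed matrix product A·B = C in O(n²) time per trial with one-sided error, whereas the best known deterministic verification essentially recomputes the product in matrix-multiplication time (roughly O(n^2.37)). A minimal sketch:

```python
import random

def freivalds(A, B, C, trials=20):
    # Randomized check that A @ B == C for n x n integer matrices.
    # Each trial costs O(n^2): multiply by a random 0/1 vector and
    # compare A(Br) with Cr. If A @ B != C, a trial wrongly passes
    # with probability <= 1/2, so the error rate is <= 2**-trials.
    n = len(A)
    for _ in range(trials):
        r = [random.randint(0, 1) for _ in range(n)]
        Br = [sum(B[i][j] * r[j] for j in range(n)) for i in range(n)]
        ABr = [sum(A[i][j] * Br[j] for j in range(n)) for i in range(n)]
        Cr = [sum(C[i][j] * r[j] for j in range(n)) for i in range(n)]
        if ABr != Cr:
            return False  # certainly A @ B != C
    return True  # A @ B == C, up to error probability 2**-trials

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(freivalds(A, B, [[19, 22], [43, 50]]))  # True (correct product)
print(freivalds(A, B, [[19, 22], [43, 51]]))  # False, except with probability 2**-20
```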
As Jaynes said:
It appears to be a quite general principle that, whenever there is a randomized way of doing something, then there is a nonrandomized way that delivers better performance but requires more thought.
There’s no contradiction because requiring more thought is costly to a bounded agent.
It may be instructive to look into computability theory. I believe (although I haven’t seen this proven) that you can get Halting-problem-style contradictions if you have multiple perfect-Bayesian agents modelling each other[1].
Many of these contradictions are (partially) alleviated if agents have access to private random oracles.
*****
If a system can express a perfect agent that will do X if and only if it has a ≤99% chance of doing X, the system is self-contradictory[2].
If a symmetric system can express two identical perfect agents that will each do X if and only if the other agent does not do X, the system is self-contradictory[3].
[1] Actually, even a single perfect-Bayesian agent modelling itself may be sufficient...
[2] This is an example where private random oracles partially alleviate the issue, though they do not make it go away. Without a random oracle the agent is correct 0% of the time regardless of which choice it makes. With a random oracle the agent can roll a d100[4] and do X unless the result is 1, and be correct 99% of the time.
[3] This is an example where private random oracles help. Both agents query their random oracle for a real-number result[5] and exchange the value with the other agent. The agent that gets the higher[6] number chooses X, the other agent chooses ~X.
[4] Not literally. As in “query the random oracle for a random choice of 100 possibilities”.
[5] Alternatively you can do it with coinflips repeated until the agents get different results from each other[7], although this may take an unbounded amount of time.
[6] The probability that they get the same result is zero.
[7] Again, not literally. As in “query the random oracle for a single random bit”.
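For concreteness, here is a minimal simulation of the randomized fixes sketched in footnotes [2], [3] and [5], with Python’s random module standing in for the private random oracles:

```python
import random

def self_referential_agent():
    # Footnote [2]: roll a d100 and do X unless it comes up 1. The agent
    # now does X with probability 99%, so "do X iff my chance of doing X
    # is <= 99%" holds on 99% of runs instead of 0%.
    return "X" if random.randint(1, 100) != 1 else "not-X"

def symmetry_break_real():
    # Footnote [3]: each agent draws a private number and the agents
    # exchange draws; the higher draw takes X. With true reals a tie has
    # probability zero (footnote [6]); with floats it is merely
    # astronomically unlikely, so we redraw on the off chance.
    a, b = random.random(), random.random()
    while a == b:
        a, b = random.random(), random.random()
    return ("X", "not-X") if a > b else ("not-X", "X")

def symmetry_break_coinflips():
    # Footnote [5]/[7]: both agents flip private coins until their bits
    # differ. Each round succeeds with probability 1/2, so this takes two
    # rounds in expectation but is unbounded in the worst case.
    while True:
        a, b = random.getrandbits(1), random.getrandbits(1)
        if a != b:
            return ("X", "not-X") if a else ("not-X", "X")

# Deterministic identical agents compute the same function of the same
# public information, so "do X iff the other doesn't" fails every time.
# With private randomness the symmetry breaks:
print(sum(self_referential_agent() == "X" for _ in range(10_000)) / 10_000)  # ~0.99
print(symmetry_break_real())       # ('X', 'not-X') or ('not-X', 'X')
print(symmetry_break_coinflips())  # likewise
```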