buybuydandavis comments on In partially observable environments, stochastic policies can be optimal