Gram_Stone comments on "In partially observable environments, stochastic policies can be optimal"