Stuart_Armstrong comments on In partially observable environments, stochastic policies can be optimal

Stuart_Armstrong 19 Jul 2016 17:16 UTC
5 points
Yes, you can see this POMDP as a variant of the absent-minded driver problem, and get that result.
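
To make the connection concrete, here is a minimal sketch of the absent-minded driver, assuming the usual textbook payoffs (0 for exiting at the first intersection, 4 for exiting at the second, 1 for never exiting). Those numbers and the function names are illustrative assumptions, not taken from the post. The driver cannot tell the two intersections apart, so a policy is a single probability p of continuing, and the expected payoff is 4p(1 - p) + p^2.

```python
def expected_payoff(p, exit_first=0.0, exit_second=4.0, never_exit=1.0):
    """Expected payoff when the driver continues with probability p at
    every intersection (the two intersections are indistinguishable).
    The default payoffs are assumed textbook values."""
    return (
        (1 - p) * exit_first          # exits at the first intersection
        + p * (1 - p) * exit_second   # continues once, then exits
        + p * p * never_exit          # continues at both intersections
    )


def best_continue_probability(steps=3000):
    """Grid search over p in [0, 1]; a sketch, not an exact solver."""
    candidates = [i / steps for i in range(steps + 1)]
    return max(candidates, key=expected_payoff)


best_p = best_continue_probability()
# The two deterministic policies correspond to p = 0 and p = 1.
deterministic_best = max(expected_payoff(0.0), expected_payoff(1.0))
print(best_p, expected_payoff(best_p), deterministic_best)
```

Under these assumed payoffs the best policy continues with probability about 2/3 and earns about 4/3, while the better of the two deterministic policies earns only 1, so the stochastic policy is strictly better.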