Now for the key idea: we’re going to compare the distribution of states X achieved by the demon with policy π, to the distribution of states X which would be achieved by the demon if it took the same distribution of actions completely independent of its observations—i.e. if it just blindly tried to sort the molecules without looking at them.
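To make that comparison concrete, here is a minimal sketch on a toy model. Everything in it is an assumption for illustration and not taken from the post: a single molecule sits on side 0 or 1 with equal probability, the demon's action is either "leave" (0) or "swap sides" (1), and the policy is "swap exactly when the molecule is on side 1". The names `P_STATE`, `policy`, `outcome`, and the rest are hypothetical. The blind demon draws its action from the same marginal distribution as the informed one, but independently of the molecule's position.

```python
from collections import defaultdict
from itertools import product
from math import log2

# Assumed toy prior: the molecule starts on side 0 or side 1 with equal probability.
P_STATE = {0: 0.5, 1: 0.5}


def policy(observation):
    """Hypothetical policy: swap (action 1) exactly when the molecule is seen on side 1."""
    return observation


def outcome(state, action):
    """Final side X of the molecule: swapping flips the side, leaving keeps it."""
    return state ^ action


def informed_distribution():
    """Distribution of X when the action depends on the observed state."""
    dist = defaultdict(float)
    for state, p in P_STATE.items():
        dist[outcome(state, policy(state))] += p
    return dict(dist)


def action_marginal():
    """Marginal distribution of actions induced by the policy."""
    marginal = defaultdict(float)
    for state, p in P_STATE.items():
        marginal[policy(state)] += p
    return dict(marginal)


def blind_distribution():
    """Distribution of X when actions follow the same marginal but ignore the state."""
    dist = defaultdict(float)
    for (state, p_s), (action, p_a) in product(P_STATE.items(), action_marginal().items()):
        dist[outcome(state, action)] += p_s * p_a
    return dict(dist)


def entropy_bits(dist):
    """Shannon entropy of a discrete distribution, in bits."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)


if __name__ == "__main__":
    print("informed:", informed_distribution(), entropy_bits(informed_distribution()))
    print("blind:   ", blind_distribution(), entropy_bits(blind_distribution()))
```

In this toy case the informed demon always ends with the molecule on side 0, while the blind demon, taking the same mix of actions, leaves it equally likely on either side. The gap between the two distributions of X is what the observations bought.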
Interesting! I’ve previously looked at this method as a solid definition of “optimization” (and utility functions and whatnot), but I never thought of applying it to Maxwell’s Demon.