When I sent him the link to this comment, he replied:
ah i think you forgot the first term in the MIMI objective, I(s_t, x_t), which makes the mapping intuitive by maximizing information flow from the environment into the user. what you proposed was similar to optimizing only the second term, I(x_t, s_t+1 | s_t), which would indeed suffer from the problems that john mentions in his reply
my imprecision may have misled you :)