Vivek Hebbar comments on Understanding and controlling a maze-solving policy network