Oh yeah, sorry, that isn’t shown there. But I believe the sum over all timesteps of the m-step expected info gain at each timestep is finite w.p.1, which would make it o(1/t) w.p.1, at least if it's nonincreasing in t (summability on its own only gives convergence to 0).
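
A rough sketch of why a finite sum would give o(1/t), in case it helps. Here a_t is just my shorthand for the m-step expected info gain at timestep t on a fixed sample path, and I'm assuming (this is an extra assumption, not something established above) that a_t is nonnegative and nonincreasing in t:

\[
\frac{t}{2}\, a_t \;\le\; \sum_{s=\lfloor t/2 \rfloor + 1}^{t} a_s \;\xrightarrow[t \to \infty]{}\; 0
\quad\Longrightarrow\quad t\, a_t \to 0, \ \text{i.e. } a_t = o(1/t).
\]

The inequality holds because the block has at least t/2 terms, each at least a_t; the block sum vanishes because the full series converges. Without monotonicity the same argument only gives liminf t a_t = 0, so occasional spikes above 1/t would not be ruled out.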