the model isn’t optimizing for anything, at training or inference time.
One maybe-useful way to point at that is: the model won’t try to steer toward outcomes that would let it be more successful at predicting text.