Thanks for your comment! I think this is a nice idea. Also, it seems that if you keep track of how many times the threshold was applied along a trajectory of tokens, you could get a corresponding lower bound based on that number. I think this is likely to yield better results since you’d go from e.g. to , and you’d hope that if your baseline model is close to the one of interest, there are only a few “pivot” tokens driving the change in rollout distributions.
But note that this idea is more closely related to Importance Sampling than to Logit Path Extrapolation, because it basically amounts to a strategy for coming up with baseline distributions to sample from. If instead you’re suggesting to interpolate the value of the cap and extrapolate beyond the observed range as we do above, I don’t see how the guarantees of this kind of interpolation are different from the ones in our post.
In any case, thanks again for your comment!