I agree that it’s good to try to answer the question, under what sort of reliability guarantee is my model optimal, and it’s worth making the optimization power vs robustness trade off explicit via toy models like the one you use above.
That being said, re: the overall approach. Almost every non-degenerate regularization method can be thought of as “optimal” wrt some robust optimization problem (in the same way that non-degenerate optimization can be trivially cast as Bayesian optimization) -- e.g. the RL-with-KL-penalty objective with respect to some π0 is the optimal solution to the following minimax problem:
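(Sketching one formulation of that type, under the assumption that the adversary subtracts a perturbation c from the proxy reward r, where β is the KL penalty coefficient and the adversary's budget is written as an exponential-moment bound under π0; this is one way to write such a problem rather than the only one:)

$$\max_{\pi}\ \min_{c\,:\ \mathbb{E}_{x\sim\pi_0}\!\left[e^{c(x)/\beta}\right]\le e^{\epsilon}}\ \mathbb{E}_{x\sim\pi}\!\left[r(x)-c(x)\right]$$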
for some ϵ>0. So the question is not so much “do we cap the optimization power of the agent” (which is a pretty common claim!) but “which way of regularizing agent policies more naturally captures the robust optimization problems we want solved in practice”.
Yep, agreed. Except I don’t understand how you got that equation from RL with KL penalties; can you explain that further?
I think the most novel part of this post is showing that this robust optimization problem (maximizing average utility while avoiding selection for upward errors in the proxy) is the one we want to solve, and that it can be done with a bound that is intuitively meaningful and can be determined without just guessing a number.
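(For concreteness, the robust problem I mean here is roughly of the following form; this is my shorthand in the notation of the parent comment rather than the post's exact statement, with U the proxy utility and c a nonnegative upward error in it whose average size under the base distribution π0 is bounded:)

$$\max_{\pi}\ \min_{c\ge 0\,:\ \mathbb{E}_{x\sim\pi_0}[c(x)]\le\epsilon}\ \mathbb{E}_{x\sim\pi}\!\left[U(x)-c(x)\right]$$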
(It’s also worth noting that an important form of implicit regularization is the underlying capacity/capability of the model we’re using to represent the policy.)
Yeah, I wouldn’t want to rely on this without a better formal understanding of it, though. KL regularization I feel like I understand.