It sure does feel like a powerful technique! We haven’t explored much how to generalize it yet, though.
At the time, we were thinking about the optimization problem “max (the one error) subject to (constraints on the other errors)”, and what the curve looks like that gives the max value as a function of the constrained errors. One (of many) angles I tried was to consider ways of transforming a latent which would move it from one point in the feasible set to another. And once I asked that question, basically the first thing I tried was the transformation in the proof, which just scales down all the errors.
At that point we had already done the Hellinger distances thing (among many other things), on the general principle of “try it in the second-order regime before trying to prove globally”, so it was just a matter of connecting the pieces together.