I thought this was a really important point, although I might be biased because I found it confusing how some discussions talked about the gradient landscape as though it could be modified, without clarifying where that ability was supposed to come from (for example, whether they were discussing reinforcement learning).
First off, the base loss landscape of the entire model is a fixed function Θ → ℝ, the same at every training step, and the current configuration of the weights selects a point on this landscape. Configuring the weights differently moves the model to a different point, but it cannot change the shape of the landscape itself.
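To make this concrete, here is a minimal sketch with a hypothetical toy quadratic loss (the names and the loss itself are illustrative, not from any specific model): the function `loss` is one fixed map from parameters to a scalar, and gradient descent only moves the point `theta` around on it.

```python
import numpy as np

# Toy loss landscape: a single fixed map Θ → R.
# It is defined once and never changes across training steps.
def loss(theta: np.ndarray) -> float:
    return float(np.sum((theta - 1.0) ** 2))

# Analytic gradient of the fixed loss above.
def grad(theta: np.ndarray) -> np.ndarray:
    return 2.0 * (theta - 1.0)

theta = np.array([5.0, -3.0])   # the weights select a point on the landscape
for step in range(100):
    theta -= 0.1 * grad(theta)  # each update moves the point, not the map

# `loss` itself is identical at every step; only where we sit on it changed.
print(theta, loss(theta))
```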
Note that this doesn’t contradict the interpretation of the gradient hacker as having control over the loss landscape through subjunctive dependence. As an analogy, consider Newcomb’s problem: even if you accept that the contents of the box subjunctively depend on your decision and conclude you should one-box, it remains true that the contents cannot change after Omega has set them up, and that there is no causal dependence of the contents on your action; the dominance argument for two-boxing simply no longer holds because of the subjunctive dependence.