Mateusz Bagiński comments on Attribution-based parameter decomposition

Mateusz Bagiński 25 Jan 2025 20:30 UTC
3 points
“Then we train to match the original model’s output by minimising an MSE loss $L_{\text{minimality}} = \text{MSE}(f(x, \theta^{*}), f(x, \kappa(x)))$”
I think you wanted:
$\text{MSE}(f(x \mid \theta^{*}), f(x \mid \kappa(x)))$