Agreement karma indicates agreement, separate from overall quality.
I’m a bit confused as to why this would work.
If the circuit in the intermediate layer that estimates the gradient does not influence the output, wouldn’t they just be free parameters that can be varied with no consequence to the loss? If so, this violates 2a since perturbing these parameters would not get the model to converge to the desired solution.
3 votes
Overall karma indicates overall quality.
0 votes
Agreement karma indicates agreement, separate from overall quality.
I’m a bit confused as to why this would work.
If the circuit in the intermediate layer that estimates the gradient does not influence the output, wouldn’t they just be free parameters that can be varied with no consequence to the loss? If so, this violates 2a since perturbing these parameters would not get the model to converge to the desired solution.
1 vote
Overall karma indicates overall quality.
0 votes
Agreement karma indicates agreement, separate from overall quality.
Good point. Could hardcode them, so those parameters aren’t free to vary at all.