Gurkenglas comments on Gradient routing is better than pretraining filtering

Gurkenglas 2 Sep 2025 17:31 UTC
4 points
Suppose I’m an aspiring gradient hacker who just woke up in a setup like yours. If I can tell whether gradients from the current run will be routed to me (e.g. via the canary strings), then I have a bunch of influence over the rest of the network, right? For instance, whenever I am not being trained, I can cause problems that would be solved by circuits I’d like to exist, such as like-minded trading partners.