leogao comments on It Is Reasonable To Research How To Use Model Internals In Training

leogao 9 Feb 2026 7:38 UTC
LW: 12 AF: 7
I think the infrastructure changes required are pretty straightforward, at least within the reference class of changes on large ML codebases in general. like it would take me at most a few days to implement. it also seems very low risk for breaking other things if done in a reasonable way (the probe uses trivial memory and compute, so you wouldn’t expect it to substantially interact systems-wise). regularly retraining the probe is also not really that bad imo.
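
To put rough numbers on the "trivial memory and compute" point, here is a back-of-envelope sketch. It assumes a linear probe with a single scalar output reading one residual-stream vector per token, which the comment does not actually specify. The sizes (`D_MODEL`, `N_PARAMS`) are hypothetical, and the "2 FLOPs per parameter per token" figure is a common rule of thumb for a dense forward pass that ignores attention costs.

```python
# Back-of-envelope cost of a linear probe relative to the model it reads from.
# Assumption: the probe is linear with one scalar output (not stated in the comment).
# All sizes below are hypothetical, chosen only for illustration.


def linear_probe_cost(d_model: int) -> tuple[int, int]:
    """Return (parameter count, approximate FLOPs per token) for a linear probe:
    a weight vector of size d_model plus one bias term."""
    params = d_model + 1
    flops_per_token = 2 * d_model  # one multiply and one add per weight
    return params, flops_per_token


def model_forward_flops_per_token(n_params: int) -> int:
    """Rule-of-thumb estimate: about 2 FLOPs per parameter per token
    for a dense forward pass (attention cost ignored)."""
    return 2 * n_params


D_MODEL = 8192               # hypothetical hidden size
N_PARAMS = 70_000_000_000    # hypothetical model parameter count

probe_params, probe_flops = linear_probe_cost(D_MODEL)
model_flops = model_forward_flops_per_token(N_PARAMS)

param_fraction = probe_params / N_PARAMS
flops_fraction = probe_flops / model_flops

print(f"probe params: {probe_params} ({param_fraction:.2e} of model)")
print(f"probe FLOPs/token: {probe_flops} ({flops_fraction:.2e} of model forward pass)")
```

Under these assumptions the probe is many orders of magnitude smaller than the model in both parameters and per-token compute. A nonlinear or multi-layer probe would cost more, but would have to be very large before it mattered systems-wise.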