TurnTrout comments on Training Process Transparency through Gradient Interpretability: Early experiments on toy language models