DLB comments on Gradient Descent on Token Input Embeddings