Tao Lin comments on Tensor-Transformer Variants are Surprisingly Performant