Tao Lin comments on Tensor-Transformer Variants are Surprisingly Performant

Tao Lin 16 Jan 2026 21:42 UTC
5 points
0
Redwood Research ran very similar experiments in 2022 but never published them. They are briefly mentioned in this podcast: https://blog.redwoodresearch.org/p/the-inaugural-redwood-research-podcast.
- Logan Riggs 8 Apr 2026 21:38 UTC
  4 points
  0
  Thanks! It looks like they tried to interpret normal NNs by breaking them up into terms of different polynomial order, and used tensor diagrams as a tool. AFAIK, they didn’t use tensor-transformers (I only ctrl-f-ed “tensor” and “bilinear”, so I could’ve missed it).
  Though analyzing tensor-transformers their way would also fail for the same reason they brought up (i.e., an exponential blow-up in the number of polynomial terms).
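
A rough way to see the blow-up mentioned above, as a sketch under simplifying assumptions rather than a description of what Redwood or the post actually computed: assume each layer is purely bilinear (its output is quadratic in its input), with no residual connections. Composing such layers doubles the polynomial degree each time, and the number of distinct monomials of that degree grows very quickly. The function names and the tiny input width below are hypothetical, chosen only for illustration.

```python
from math import comb


def polynomial_degree(num_layers: int) -> int:
    """Degree in the input after stacking purely bilinear (quadratic) layers.

    Assumption: no residual stream, so each layer exactly doubles the degree.
    """
    return 2 ** num_layers


def num_monomials(num_inputs: int, degree: int) -> int:
    """Count distinct monomials of exactly `degree` in `num_inputs` variables.

    This is the standard stars-and-bars count, C(n + d - 1, d). It is an
    upper bound on the terms one would have to inspect; a trained model
    may have many coefficients that are zero or negligible.
    """
    return comb(num_inputs + degree - 1, degree)


def term_counts(num_inputs: int, max_layers: int) -> list[tuple[int, int, int]]:
    """Return (layers, degree, monomial count) for depths 1..max_layers."""
    rows = []
    for layers in range(1, max_layers + 1):
        degree = polynomial_degree(layers)
        rows.append((layers, degree, num_monomials(num_inputs, degree)))
    return rows


if __name__ == "__main__":
    # A deliberately tiny input width; real models are far wider.
    for layers, degree, count in term_counts(num_inputs=16, max_layers=6):
        print(f"layers={layers} degree={degree} monomials={count}")
```

Even at an input width of 16, the count becomes unmanageable within a handful of layers, which is why enumerating polynomial terms does not scale as an interpretation strategy.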