Thanks! It looks like they tried to interpret normal NNs by breaking them up into different-order terms, using tensor diagrams as a tool. AFAIK, they didn't use tensor transformers (I only ctrl-F-ed "tensor" and "bilinear", so I could've missed it).
Though analyzing tensor transformers their way would also fail for the same reasons they brought up (i.e., exponential blow-up of polynomial terms).
Redwood Research did very similar experiments in 2022 but didn't publish them. They are briefly mentioned in this podcast: https://blog.redwoodresearch.org/p/the-inaugural-redwood-research-podcast.