astle dsa comments on Tree Transformers: A step towards generalizing the transformer architecture

astle dsa 29 Jun 2026 22:11 UTC
1 point
My plan was to gather tree-shaped inputs and observe whether tree transformers offer any advantage over vector transformers.
I do not think we perform attention on 1D vectors because of the data's shape. Rather, as I mentioned earlier, we usually force our data into flattened arrays because doing so offers many pragmatic advantages that are hard to ignore.
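To make "forcing data into a flattened array" concrete, here is a minimal sketch of one common way to do it: a pre-order traversal that emits bracket tokens so the tree shape can still be recovered from the 1D sequence. The `Node` class, the `flatten` function, and the bracket scheme are my own illustrative assumptions, not anything taken from the post.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical tree node, for illustration only.
    label: str
    children: list = field(default_factory=list)


def flatten(node):
    """Pre-order traversal; brackets mark where each subtree starts and ends."""
    tokens = ["(", node.label]
    for child in node.children:
        tokens.extend(flatten(child))
    tokens.append(")")
    return tokens


# The expression tree for 1 + (2 * 3).
tree = Node("+", [Node("1"), Node("*", [Node("2"), Node("3")])])
tokens = flatten(tree)
print(" ".join(tokens))
```

The resulting token list is what a vector transformer would see. The hierarchy is still present, but only implicitly through the brackets, which is the trade-off I would want the tree-shaped inputs to test.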