williawa comments on 1a3orn’s Shortform

williawa 12 Feb 2026 13:24 UTC
1 point
Another argument is that you can more cleanly backprop through it.
A third argument is that inference memory and per-token speed stay constant as context length grows, at least if it's implemented like a traditional RNN.
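To make the memory point concrete, here is a rough toy sketch (my own illustration, not anything from a real model): it compares how many floats a traditional RNN has to keep around versus a transformer-style KV cache as the context grows. The function names, the state size `d`, the fixed 0.5 weights, and the single-layer, single-head cache are all assumptions made up for the example.

```python
import math
import random


def rnn_memory(seq_len, d=4, seed=0):
    """Floats held in state after each step of a toy elementwise tanh RNN."""
    rng = random.Random(seed)
    h = [0.0] * d  # the only thing carried between steps
    sizes = []
    for _ in range(seq_len):
        x = [rng.gauss(0, 1) for _ in range(d)]
        # The new state overwrites the old one, so its size never changes.
        h = [math.tanh(0.5 * hi + 0.5 * xi) for hi, xi in zip(h, x)]
        sizes.append(len(h))
    return sizes


def kv_cache_memory(seq_len, d=4, seed=0):
    """Floats held by a toy single-layer, single-head KV cache after each step."""
    rng = random.Random(seed)
    cache = []  # one (key, value) pair is appended per token
    sizes = []
    for _ in range(seq_len):
        k = [rng.gauss(0, 1) for _ in range(d)]
        v = [rng.gauss(0, 1) for _ in range(d)]
        cache.append((k, v))
        sizes.append(2 * d * len(cache))
    return sizes


print(rnn_memory(5))       # [4, 4, 4, 4, 4]
print(kv_cache_memory(5))  # [8, 16, 24, 32, 40]
```

The same asymmetry shows up in per-token compute: the RNN step touches a fixed-size state, while attention has to read every cached key and value, so its cost per token grows with the length of the context.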