It isn’t obvious to me that this kind of big-O analysis matters that much in practice. RNNs will still need the ability to recall any information from the entire gene
I agree it is unclear, but I would still expect it to. Just because.
Its very easy for models to remember stuff in general
Like the residual stream of a transformer is empirically highly redundant
Does it actually need to remember all the stuff? Humans don’t remember all the stuff.
I agree it is unclear, but I would still expect it to. Just because.
Its very easy for models to remember stuff in general
Like the residual stream of a transformer is empirically highly redundant
Does it actually need to remember all the stuff? Humans don’t remember all the stuff.