The “million token” recurrent memory transformer was first published in July 2022. The new paper is just an investigation of whether the method can also be used for BERT-like encoder models.
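The core mechanism is simple enough to sketch. Below is a rough, made-up illustration of the general recurrent-memory idea (not the authors' code, and it simplifies the layout: the actual RMT for encoder models places memory tokens at both ends of each segment). A long input is split into segments, learned memory tokens are attached to each segment, and the memory slots coming out of one segment are fed into the next, so full attention only ever runs over one window at a time.

```python
import torch
import torch.nn as nn

class RecurrentMemorySketch(nn.Module):
    """Toy sketch of segment-level recurrence with memory tokens (sizes are arbitrary)."""
    def __init__(self, d_model=256, n_memory=16, n_layers=2, n_heads=4):
        super().__init__()
        self.memory = nn.Parameter(torch.randn(n_memory, d_model))  # initial learned memory tokens
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.n_memory = n_memory

    def forward(self, segments):
        # segments: list of tensors, each (batch, seg_len, d_model)
        batch = segments[0].size(0)
        mem = self.memory.unsqueeze(0).expand(batch, -1, -1)
        outputs = []
        for seg in segments:
            x = torch.cat([mem, seg], dim=1)       # memory tokens + current segment
            h = self.encoder(x)                    # attention is quadratic only in the window size
            mem = h[:, :self.n_memory]             # updated memory carried to the next segment
            outputs.append(h[:, self.n_memory:])   # per-token outputs for this segment
        return torch.cat(outputs, dim=1), mem
```

So the “million tokens” comes from chaining many such windows through the memory, not from attending over a million tokens at once.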
Given that there have been a ton of papers that “solved” the quadratic bottleneck, I wouldn’t hold my breath.