Noosphere89 comments on My model of what is going on with LLMs

Noosphere89 17 Feb 2025 17:41 UTC
2 points
−2
For an algorithmic advance that might be relevant: this paper presents a new model that apparently scales with no currently known bound, and it’s a recurrent architecture that actually works.

This is moderately spooky, both because it works at all and because the fact that it works is a signal for researchers to try to improve the architecture. Given the funding going into AI, a lot of money might soon flow to these approaches:

https://arxiv.org/abs/2502.05171