[Question] Supposing the 1-bit LLM paper pans out

https://arxiv.org/abs/2402.17764 claims that 1-bit LLMs (more precisely, ternary weights at ~1.58 bits) are possible without loss of quality.

If this scales, I'd imagine there is a ton of speedup to unlock, since our hardware has been optimized for low-bit binary operations for decades. What does this imply for companies like Nvidia and for the future of LLM inference/training?
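To make the source of the speedup concrete: with weights restricted to {-1, 0, +1}, a matrix-vector product needs no multiplications at all, only additions and subtractions. A minimal illustrative sketch (real kernels would pack weights into 2-bit lanes and vectorize; this is just the arithmetic idea):

```python
# Sketch of why ternary (~1.58-bit) weights remove multiplications:
# with w in {-1, 0, +1}, a dot product is just adds and subtracts.

def ternary_matvec(W, x):
    """Matrix-vector product where every W[i][j] is -1, 0, or +1."""
    out = []
    for row in W:
        acc = 0.0
        for w, xj in zip(row, x):
            if w == 1:
                acc += xj      # multiply-by-(+1) becomes an add
            elif w == -1:
                acc -= xj      # multiply-by-(-1) becomes a subtract
            # w == 0 contributes nothing (and could be skipped entirely)
        out.append(acc)
    return out

W = [[1, 0, -1],
     [-1, 1, 1]]
x = [2.0, 3.0, 5.0]
print(ternary_matvec(W, x))  # [-3.0, 6.0]
```

Since matrix multiplies dominate LLM inference cost, replacing multiply-accumulate with add/subtract (plus skipping zeros) is where the hoped-for hardware win would come from.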

Do we get another leap in LLM capabilities? Do CPUs become more useful? And can this somehow be applied to make training more efficient?

Or is this paper not even worth considering for some obvious reason I can't see?

Edit: per the paper, this method is already applied during training, not just at inference time.
