I buy that 1 and 4 are the case, combined with DeepSeek probably being satisfied that GPT-4-level models were achieved.
Edit: I did not mean to imply that the GPT-4ish neighbourhood is where LLM pretraining plateaus at all, @Thane Ruthenis.