[Question] Is there a “critical threshold” for LLM scaling laws?

Lots of important phenomena have a critical threshold. In a nuclear weapon, each fission event produces a certain number of neutrons, and some of those trigger further fission events. If each event triggers slightly more than one follow-on event on average, the reaction grows exponentially. If slightly less than one, the chain fizzles out.
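To make the threshold concrete: this is a branching process, and whether the chain grows or dies depends only on whether the average number of follow-on events per event is above or below 1. A toy sketch (the k values are purely illustrative):

```python
def expected_chain(k, generations):
    """Expected number of fission events per generation when each event
    triggers k follow-on events on average: the expectation after g
    generations is k**g."""
    return [k ** g for g in range(generations + 1)]

print(expected_chain(1.1, 20)[-1])  # slightly supercritical: ~6.7 and still growing
print(expected_chain(0.9, 20)[-1])  # slightly subcritical: ~0.12 and fizzling out
```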

Similarly in quantum computing. Current quantum computers struggle with noise, which causes superpositions to decohere over time. However, if the physical error rate can be kept below a certain threshold, error-correcting codes make it possible to run arbitrarily long computations.
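The same sub/super-threshold structure shows up even in the simplest classical error-correcting code. This is a 3-bit repetition code against bit flips, not a real quantum fault-tolerance scheme, but it illustrates the threshold behaviour:

```python
def majority_vote_error(p):
    """Logical error rate of a 3-bit repetition code: the majority vote
    fails when 2 or 3 bits flip, each independently with probability p."""
    return 3 * p**2 * (1 - p) + p**3

def concatenate(p, levels):
    """Encode the encoded bits again and again. Below the p = 0.5 threshold
    the logical error rate is driven toward 0; above it, toward 1."""
    for _ in range(levels):
        p = majority_vote_error(p)
    return p

print(concatenate(0.4, 5))  # sub-threshold: shrinking toward 0
print(concatenate(0.6, 5))  # super-threshold: climbing toward 1
```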

When trying to extend LLMs to difficult multi-step problems, I often feel like I’m dealing with a similar phenomenon. For example, if you ask an LLM to write a novel, it will follow the plot for a while and then spontaneously jump to a different story. It feels like the “amount of information” passed from one step to the next is not quite enough to keep the story going indefinitely. LLM agents run into similar problems: they seem to work for a while, but eventually get stuck in a loop or lose their train of thought.
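One way to formalize that feeling: suppose each generated chunk stays consistent with the story so far with some independent probability p (a made-up parameter, purely for illustration). Then without any error-correction mechanism, coherence decays exponentially no matter how close p gets to 1, which looks like the first scenario below; the second would require something that drives the effective per-step error all the way to zero. A sketch:

```python
def survival_curve(p, steps):
    """Probability the story is still on-plot after n steps, assuming each
    step independently stays consistent with probability p (toy model)."""
    return [p ** n for n in range(steps + 1)]

# Even a very reliable per-step model decays: the expected coherent length is
# roughly 1/(1 - p), so p = 0.99 gives ~100 steps and p = 0.999 gives ~1000.
print(survival_curve(0.99, 500)[-1])   # ~0.0066: almost surely off-plot by step 500
print(survival_curve(0.999, 500)[-1])  # ~0.61: better, but still decaying
```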

It seems like there are two ways this behavior could change as we scale up LLMs:

  1. LLMs get gradually better as we increase their capabilities (they go from being able to write 1 page to writing 2 to writing 3...)

  2. There is some “critical size” threshold above which agents are able to self-improve without limit, and we suddenly go from writing pages to writing entire encyclopedias.

Does anyone know of good evidence for/against either of these cases? (The strongest evidence in favor of 1 seems to be “that’s how it’s gone so far.”)
