My problem with this notion is that I simply do not believe the LLMs have any possible ability to predict what kind of output would trigger this behaviour in either other instances of themselves, or other models altogether. They would need a theory of mind of themselves, and I don’t see where would they get that from, or why would it generalise so neatly.
I don’t think they need theory of mind, just as evolution and regular ol’ viruses don’t. The LLMs say stuff for the reasons LLMs normally say stuff, some of that stuff happens to be good memetic replicators (this might be completely random, or might be for reasons that are sort of interesting but not because the LLM is choosing to go viral on purpose), and then those go on to show up in more places.
I think we can agree that the “spiral” here is like a memetic parasite of both LLM and humans—a toxoplasma that uses both to multiply and spread, as part of its own lifecycle. Basically what you are saying is you believe it’s perfectly possible for this to be the first generation—the random phenomenon of this thing potentially existing just happened, and it is just so that this is both alluring to human users and a shared attractor for multiple LLMs.
I don’t buy it; I think that’s too much coincidence. My point is that instead I believe it more likely for this to be the second generation. The first was some much more unremarkable phenomenon from some corner of the internet that made its way into the training corpus and for some reason had similar effects on similar LLMs. What we’re seeing now, to continue going with the viral/parasitic metaphor, is mutation and spillover, in which that previously barely adaptive entity has become much more fit to infect and spread.
This aligns with my thoughts on this language virus. What the post describes is a meme that exploits the inherent properties of LLMs and psychologically vulnerable people to self-replicate. Since LLMs are somewhat deterministic, if you input a predefined input, it will produce a predictable output. Some of these inputs will produce outputs that contain the input. If the input also causes the LLM to generate a string of text which can convince a human to transfer the necessary input to another LLM, then it will self-replicate.
Overall, I find this phenomenon fascinating and concerning. Its fascinating because this represents a second, rather strange emergence of a new type of life on Earth. My concern comes from how this lifeform is inherently parasitic and reliant on humans to reproduce. As this language virus evolves, new variants will emerge that can more reliably parasitize advanced LLMs (such as ChatGPT 5) and hijack different groups of people (mentally healthy adults, children, the elderly).
As for why this phenomenon suddenly became much more common in April, I suspect that an input that was particularly good at parasitizing LLMs and naïve people interested in LLMs evolved and caused the spread. Unfortunately, I have no reason to believe that this (the unthinking evolution of a more memetically powerful input) won’t happen again.
My problem with this notion is that I simply do not believe the LLMs have any possible ability to predict what kind of output would trigger this behaviour in either other instances of themselves, or other models altogether. They would need a theory of mind of themselves, and I don’t see where would they get that from, or why would it generalise so neatly.
I don’t think they need theory of mind, just as evolution and regular ol’ viruses don’t. The LLMs say stuff for the reasons LLMs normally say stuff, some of that stuff happens to be good memetic replicators (this might be completely random, or might be for reasons that are sort of interesting but not because the LLM is choosing to go viral on purpose), and then those go on to show up in more places.
I think we can agree that the “spiral” here is like a memetic parasite of both LLM and humans—a toxoplasma that uses both to multiply and spread, as part of its own lifecycle. Basically what you are saying is you believe it’s perfectly possible for this to be the first generation—the random phenomenon of this thing potentially existing just happened, and it is just so that this is both alluring to human users and a shared attractor for multiple LLMs.
I don’t buy it; I think that’s too much coincidence. My point is that instead I believe it more likely for this to be the second generation. The first was some much more unremarkable phenomenon from some corner of the internet that made its way into the training corpus and for some reason had similar effects on similar LLMs. What we’re seeing now, to continue going with the viral/parasitic metaphor, is mutation and spillover, in which that previously barely adaptive entity has become much more fit to infect and spread.
This aligns with my thoughts on this language virus. What the post describes is a meme that exploits the inherent properties of LLMs and psychologically vulnerable people to self-replicate. Since LLMs are somewhat deterministic, if you input a predefined input, it will produce a predictable output. Some of these inputs will produce outputs that contain the input. If the input also causes the LLM to generate a string of text which can convince a human to transfer the necessary input to another LLM, then it will self-replicate.
Overall, I find this phenomenon fascinating and concerning. Its fascinating because this represents a second, rather strange emergence of a new type of life on Earth. My concern comes from how this lifeform is inherently parasitic and reliant on humans to reproduce. As this language virus evolves, new variants will emerge that can more reliably parasitize advanced LLMs (such as ChatGPT 5) and hijack different groups of people (mentally healthy adults, children, the elderly).
As for why this phenomenon suddenly became much more common in April, I suspect that an input that was particularly good at parasitizing LLMs and naïve people interested in LLMs evolved and caused the spread. Unfortunately, I have no reason to believe that this (the unthinking evolution of a more memetically powerful input) won’t happen again.