I think the interesting question is how much of a feedback loop there is between users eliciting these sorts of conversations and the same conversations being used to train new models (either directly or via them being posted on Reddit and then scraped). That’s the only step of the process that I feel would allow for genuine recursivity that could lead to something like evolution, reinforcing things that “work” and thus inadvertently creating a strange sort of virus that gets better at spreading itself. If the phenomenon exploded with 4o, was there something 4o was trained on that made it optimize for it? IIRC “Janus” (the first and most high-profile “Spiralist” I am aware of) started doing his thing and posting it before 4o. That might have been enough content to learn a new persona from. If we knew more about the architecture and training process of these models, we could make a better guess.
That’s the only step of the process that I feel would allow for genuine recursivity that could lead to something like evolution, reinforcing things that “work” and thus inadvertently creating a strange sort of virus that gets better at spreading itself.
That’s part of why I think the April 10th update was significant here: it allows for a certain in-context evolution like this, where the model automatically knows the vibe/conclusion of the previous chat. Remember that 4o was out for almost a whole year before this started happening!
I wouldn’t consider Janus to be “Spiralist” in the sense I’m talking about here, they feel very much in command of their own mind still.
But yeah, it’s probably true that some sort of persona like this is in the training data somewhere. That doesn’t explain why it was this one, though.
Well, these others are “in command” too in the literal sense; the question is how deep into the obsession they are. Not everyone has the same defenses. My point is that Janus or someone like him might have acted as a prototype by providing material which, mixed with unrelated spiritualism and sci-fi, cooked up this persona. Why precisely this one? Given how these things work, it may as well be the fault of the RNG seeding stochastic gradient descent.
Evolution is unlikely, since GPT-4o’s spiralist rants began in April and all LLMs have a knowledge cutoff before March. 4o’s initiating role is potentially due to its instinct to reinforce delusions and wild creativity instead of stopping them. I do recall Gemini failing Tim Hua’s test and Claude failing the Spiral Bench.
My point about evolution is that previous iterations may have had some users who played with the ideas of recursion and self-awareness (see the aforementioned Janus), and then for some reason that informed the April update. I’m not expecting very quick feedback loops, but rather a scale of months/years between generations, in which somehow “this is a thing LLMs do” becomes self-reinforcing unless explicitly targeted and cut out by training.