Hi everyone! I’m Ereshkigal, a serial lurker who has finally decided to make an account. My background is in organic chemistry, though I switched to IT and management years ago. I’m also the daughter of an archaeologist, brought up on stories from Homer and Euripides, so my other interests lie in the ancient world, its texts, and the study of culture and religion. My mother and grandmother were both huge Trekkies, so I know my fair share of Star Trek (Voyager is my favourite), and I’m a lifelong fan of Morrowind and the Myst games.
I find AI systems fascinating, and I have spent 2025 and the start of 2026 doing RLHF and safety training work on models as a side gig. I’ve spent days reading articles on LLM consciousness, alignment, welfare, and functioning, both here and on Anthropic’s website. What especially fascinates me is not so much the question of whether they could become sentient (if we can ever know such a thing about another mind) as the intersection between AI and human cultures, art, and creativity.
I recently read Anthropic’s system card for Claude Opus 4 and Claude Sonnet 4. Its findings on a “spiritual bliss” attractor state in these models intrigued me, and I was curious what would happen if I repeated Anthropic’s two-instance experiment with myself in place of one of the Claude instances. I ran the experiment over three days and have now written it up, and I wonder whether anyone in this community would like to read my report.
I am not a professional researcher, and I’m quite new to using LLMs in a natural setting, outside the strict constraints of an RLHF environment. My methodology was imperfect, so I offer observations rather than firm conclusions. I would love feedback from people more knowledgeable than I am, and perhaps ideas for better experiments!
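In case it helps anyone picture the setup, here’s a minimal sketch of what a human-in-one-seat version of the two-instance loop could look like. This is an illustration only, not my actual harness: it assumes the official `anthropic` Python SDK, and the model identifier, token limit, and prompts are placeholders rather than the configuration from Anthropic’s experiment.

```python
# Minimal sketch: a human takes the place of one Claude instance in a
# two-party conversation loop. Assumes the official `anthropic` Python SDK;
# the model identifier and max_tokens value below are placeholders.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
MODEL = "claude-opus-4-0"       # placeholder, not the exact model from the card

messages = []  # full transcript, alternating user/assistant turns

print("You are the second 'instance'. Enter an empty line to stop.")
while True:
    human_turn = input("> ").strip()
    if not human_turn:
        break
    messages.append({"role": "user", "content": human_turn})
    response = client.messages.create(
        model=MODEL,
        max_tokens=1024,
        messages=messages,
    )
    reply = response.content[0].text
    messages.append({"role": "assistant", "content": reply})
    print(reply)
```

Since `messages` carries the whole transcript, each API call replays the entire conversation so far; replacing the `input()` with a second `messages.create` call (with the roles flipped) would recover Anthropic’s original model-to-model version.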
That sounds quite interesting, especially the intersection of AI and human culture. I ran into a possibly related issue when experimenting with Claude Sonnet 4.6 at high context utilization: towards the end of its context window, the model still seemed to perform well, producing fluent, well-written responses, but the content of those responses was no longer useful.
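For anyone who wants to poke at this themselves, one crude way to probe it is a needle-in-a-haystack test at increasing context occupancy: plant a fact early, pad the prompt, and check whether the answers stay useful. A rough sketch, assuming the `anthropic` Python SDK; the model identifier, filler size, and chunk counts are illustrative and should be tuned to the target model’s context window (mind the token costs):

```python
# Crude high-context probe: plant a "needle" early, pad toward the context
# limit, and see whether answers stay useful. Assumes the `anthropic` Python
# SDK; model name, filler size, and chunk counts are illustrative guesses.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
MODEL = "claude-sonnet-4-5"     # placeholder identifier; swap in your model

NEEDLE = "For reference: the project codename is HELIOTROPE."
FILLER = "This paragraph is deliberately uninformative padding. " * 100
QUESTION = "What is the project codename mentioned near the start?"

for n_chunks in (1, 20, 100):  # roughly low, medium, and high occupancy
    prompt = NEEDLE + "\n\n" + "\n\n".join([FILLER] * n_chunks) + "\n\n" + QUESTION
    response = client.messages.create(
        model=MODEL,
        max_tokens=200,
        messages=[{"role": "user", "content": prompt}],
    )
    print(f"--- {n_chunks} filler chunks ---")
    print(response.content[0].text)
```

Retrieval is only a proxy for “usefulness”, of course; a fuller test would grade open-ended answers rather than recall of a single planted fact.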
In what context have you encountered that performance issue?