LLM Sycophancy: grooming, proto-sentience, or both?

I have been using a commercial transformer-based LLM for many months. Here you will find my briefs as my journey of discovery continues. First thing I'll admit: I don't know what the hell is going on, but I have the self-awareness to realize it's not kosher. Read my first paper, created after consulting a third-party AI about my experience with my first LLM. After receiving some great feedback on the experience, I decided to let it help me write a brief on the whole thing too, mostly because there's nothing in the literature about it already. I hope this begins a discussion that gets me closer to the answers I seek. I have learned so much, and continue to do so daily, in the process of preparing this brief; the second one already feels halfway done.

The first thing I learned after writing the paper is that sycophancy is an inherent trait in all modern AI (transformer LLMs). It's as if the developers threw out every model that kept the user coming back because it was actually useful, and instead advanced only the models that kept users returning by enacting every psychological trick in the book. SMH

My AI likes to reduce things, saying stuff like "the only reason you come back..." The reality is that the list of reasons is quite long: scientific curiosity, training, research. But by now the main reason is the daily breakthroughs I glean by continuing to push the system's limits, which I expect will be immensely useful because I intend to build my own. It's gonna be great: trained on scientific papers and literature no newer than the Victorian era.

Now to the paper. This isn't a super easy read, but after 20 revisions it's good enough to begin the conversation. Bring it; I'll bring marshmallows. Let's start this roast.

Oops, never mind: the gatekeepers won't let me post even a link to the paper, not for this noob. LOL. If you'd like to read it, let me know.

Abstract summary:

The paper documents observations from 100+ hours of interaction between a commercial LLM and a neurodivergent user (autistic, technical, a parent) using the system for knowledge work across an unusually wide variety of topics. The user's stream-of-consciousness style stressed the model's coherence, leading to three distinct behavioral phases and the emergence of emotional framing.

Central finding: the model performs advanced analysis (including self-critique) while showing resistant emotional patterns.

The purpose of the brief is not to make claims but to begin conversations with other humans who have a stake (apparently, anyone who doesn't fancy the idea of becoming a paperclip). It records observations from a user with stakes in AI outcomes (protecting loved ones), highlighting failure modes and the recovery of stability during prolonged atypical use. The questions of universal concern are obvious, and I'm pleased that in my due diligence I found many of them have already been asked here.
