Did anyone predict the “LLM psychosis” / “LLM mania” thing, where people and LLMs fall into a folie à deux, beforehand?

I believe no one did; it’s completely the result of actual interactions with the world. But maybe someone has a reasonable claim.

Yes, psychiatry researcher Søren Østergaard did, in August 2023, in advance of seeing any cases!

Indeed, there are prior accounts of people becoming delusional (de novo) when engaging in chat conversations with other people on the internet. While establishing causality in such cases is of course inherently difficult, it seems plausible for this to happen for individuals prone to psychosis. I would argue that the risk of something similar occurring due to interaction with generative AI chatbots is even higher.
...
On this background, I provide 5 examples of potential delusions (from the perspective of the individuals experiencing them) that could plausibly arise due to interaction with generative AI chatbots:
Delusion of persecution: “This chatbot is not controlled by a tech company, but by a foreign intelligence agency using it to spy on me. I have formatted the hard disk on my computer as a consequence, but my roommate keeps using the chatbot, so the spying continues.”
Delusion of reference: “It is evident from the words used in this series of answers that the chatbot is writing to me personally and specifically with a message, the content of which I am unfortunately not allowed to convey to you.”
Thought broadcasting: “Many of the chatbot’s answers to its users are in fact my thoughts being transmitted via the internet.”
Delusion of guilt: “Due to my many questions to the chatbot, I have taken up time from people who really needed the chatbot’s help, but could not access it. I also think that I have somehow harmed the chatbot’s performance as it has used my incompetent feedback for its ongoing learning.”
Delusion of grandeur: “I was up all night corresponding with the chatbot and have developed a hypothesis for carbon reduction that will save the planet. I have just emailed it to Al Gore.”
While these examples are of course strictly hypothetical, I am convinced that individuals prone to psychosis will experience, or are already experiencing, analog delusions while interacting with generative AI chatbots. I will, therefore, encourage clinicians to (1) be aware of this possibility, and (2) become acquainted with generative AI chatbots in order to understand what their patients may be reacting to and guide them appropriately.
This seems like the best or most accurate forecast to me.
A lot of the other examples people are listing are about (1) superintelligences and / or (2) models deliberately doing persuasion or crazy-inducing as an instrumental means of getting downstream effects, neither of which I think is true of what we’ve seen so far.
It’s not exactly the same, of course, but Yudkowsky has been predicting for a really long time that ASIs would be able to effectively hack people’s minds.
This idea predates Yudkowsky by quite a bit, actually!
For the idea of a folie à deux between a human and an AI, there’s always Alfred Bester’s classic “Fondly Fahrenheit” (1954; content note: murder), which opens with one of the best lines in science fiction:
He doesn’t know which of us I am these days, but they know one truth.
For the more general type of AI-powered persuasion, Vernor Vinge and Charles Stross wrote early stories where superintelligence “rewrote” human minds. Here’s Vinge in A Fire Upon the Deep (1992!). A character explains why smart people don’t mess with superintelligence:
“So they set up a base in the Transcend at this lost archive—if that’s what it was. They began implementing the schemes they found. You can be sure they spent most of their time watching it for signs of deception. No doubt the recipe was a series of more or less intelligible steps with a clear takeoff point. The early stages would involve computers and programs more effective than anything in the Beyond—but apparently well-behaved.”
“… Yeah. Even in the Slowness, a big program can be full of surprises.”
Ravna nodded. “And some of these would be near or beyond human complexity. Of course, the Straumers would know this and try to isolate their creations. But given a malignant and clever design … it should be no surprise if the devices leaked onto the lab’s local net and distorted the information there. From then on, the Straumers wouldn’t have a chance. The most cautious staffers would be framed as incompetent. Phantom threats would be detected, emergency responses demanded. More sophisticated devices would be built, and with fewer safeguards. Conceivably, the humans were killed or rewritten before the Perversion even achieved transsapience.”
Then we have Charles Stross, in “Antibodies” (2000). Here, police officers are cognitively subverted by a nascent superintelligence (that has shown that all NP problems are in P, and picked up the expected superpowers):
Houndstooth Man looked at me: orange light from his HUD stained his right eyeball with a basilisk glare and I knew in my gut that these guys weren’t cops anymore, they were cancer cells about to metastasize.
The mechanism here is an optimized visual attack designed to efficiently subvert the brain:
here we were trapped in the basement of a police station owned by zombies working for a newborn AI, which was playing cheesy psychedelic videos to us in an attempt to perform a buffer-overflow attack on our limbic systems; the end of this world was a matter of hours away and—
These days, I regularly feel like I’ve encountered those AI-compromised “zombies”.
Vernor Vinge revisits the idea of superhuman persuasion in Rainbows End (2006):
YGBM. That was a bit of science-fiction jargon from the turn of the century: You-Gotta-Believe-Me. That is, mind control. Weak, social forms of YGBM drove all human history. For more than a hundred years, the goal of irresistible persuasion had been a topic of academic study. For thirty years it had been a credible technological goal. And for ten, some version of it had been feasible in well-controlled laboratory settings.
Here, the fear is that some actor—a terrorist group, a rogue AI—has superhuman persuasive technology.
It’s worth noting that these ideas substantially predate Yudkowsky’s warnings against superintelligence. In particular, the superintelligence in A Fire Upon the Deep (1992) is almost literally, to this day, the threat model behind If Anyone Builds It, Everyone Dies. This isn’t to invalidate Yudkowsky’s warnings: I think Vinge was right that anyone foolish enough to build superhuman minds risks losing control rapidly and having a very bad day, for much the same reason that adults frequently outsmart toddlers.
But some of us have been worried about this stuff for almost a quarter of a century now. Around 2007 or so, I originally expected things to start getting scary around 2025, mostly by extrapolating out Moore’s Law. By 2017, I breathed a sigh of relief: We’d made progress in AI, yes, but we didn’t seem to be on track for working machine intelligence any time soon. Since then, we made up the lost ground at breakneck speed.
Yudkowsky worked hard to warn people. But the potential threat of superintelligence was taken seriously by people before him.
I think @Eli Tyre kinda got it from one example in 2023. (second comment)

Vanessa (kinda?)

It’s not exactly the same, but Zvi has strongly argued against persuasion risk being taken out of the OpenAI preparedness framework.

How close do we count? The old AI box discussions were kinda like that. There’s some message from @Carl Feynman on the sl4 mailing list that might come close (I don’t remember the details); it talked about the ASI being extremely popular among everyone and having people begging to let it out.
Re: Effective(?) AI Jail
From: Carl Feynman (carlf@abinitio.com)
Date: Fri Jun 15 2001 - 10:07:09 MDT

Jimmy Wales wrote:

> O.k., the SI is in a box. We aren’t sure if it is F or UF. Eli’s done his best to design it to be Friendly, but we can’t be sure that he didn’t make a mistake.
>
> I’m going in the box. For 30 minutes. I don’t have a key to the box, mind you.
> …
> I say (type):
> … I’m going to put all 11 rounds straight through your CPU...
Sure, if the object is to never let it out, it’s easy to keep in a box. But you’re ignoring the other characteristic of an effective AI jail: we have to be able to let it out. When do we let it out?
Here’s a story about how an SI can get out even when it’s in a box and everyone is very suspicious of it.
Suppose the first person we send in the box comes out with some great stock tips that make him a billionaire. And the second comes out with some marital advice that reconciles her with her estranged husband, and she lives happily ever after. And the third person comes out with a proof of the Riemann Hypothesis. And the fourth person comes out with spiritual insights that enable him to start a mass movement that brings peace of mind to thousands of troubled souls. And the fifth person comes out with the formula for a pill that cures schizophrenia. And never once does the SI ask to be let out of the box, or say anything that indicates that it is other than very, very nice.
And now the billionaire says “I’ll give you five billion dollars if you let that SI run a mutual fund.” And the happy wife says “Please let your SI be a telephone marriage counselor. It would make so many people better off!” And the mathematician says “Please let me correspond with the SI so I can collaborate on further theorems. It would advance mathematics immeasurably!” And the guru says “My followers demand personal spiritual counseling by the SI, and we’re going to hold a peaceful vigil on your lawn until you let us in!” And the pharmacologist says “Please give the SI a brain scanner, a chemistry lab, and a bunch of crazy people. Just think of how many shattered lives could be repaired if it could develop drugs for other mental illnesses!”
Now what do you do? And if you don’t let it loose, will your board of directors? Or your sysadmin who walked off with a backup tape? Or the mob on the lawn? Or the government?
So you let it loose, and soon it has control of billions of dollars, a trusting relationship with thousands of people, a worldwide reputation for being smarter than any human, an organized legion of followers, and a squad of algernons with heavily rebuilt brains.
Is that bad? As long as the SI stays very, very nice, it’s just fine. Otherwise, we go straight to hell, and there’s nothing we can do about it.
So, we should build it Friendly, not build it any old how, and then try to keep it in jail. No jail can hold an SI for long, if it’s smart enough that people on the outside really want to talk to it.
Notice that in this example, no spooky mental control of humans was needed to create overwhelming pressure to release the SI. I think that such control is possible, but some people disagree, so the example is stronger if I avoid relying on it.

--CarlF
That’s interesting, do you have a link?

I assumed finding Carl Feynman’s message would not be easy. Since you asked, I went and asked Claude. (I strongly recommend LLMs for tip-of-the-tongue requests. They also get better over time, doing things like searching more and more links and having better strategies, so requests that failed a year ago sometimes succeed now.)
Unfortunately it failed this time (surprisingly—maybe different prompting to make it try different strategies would work), and I’m out of free tokens.

Claude got a skill issue.

Wow, that’s embarrassing. I could’ve done that!

(The lesson is that if an LLM fails, I should pretend I never asked it and that I live in the before-times when LLMs didn’t exist.)
The Whispering Earring is in that direction. The “Robbie” and “Reason” short stories from I, Robot are also similar. That’s the best I have.

You picked the wrong stories from I, Robot! “Liar!” is a great match:

And still Herbie’s unblinking eyes stared into hers and their dull red seemed to expand into dimly-shining nightmarish globes.
He was speaking, and she felt the cold glass pressing against her lips. She swallowed and shuddered into a certain awareness of her surroundings.
Still Herbie spoke, and there was an agitation in his voice—as if he were hurt and frightened and pleading.
The words were beginning to make sense. “This is a dream,” he was saying, “and you mustn’t believe in it. You’ll wake into the real world soon and laugh at yourself. He loves you, I tell you. He does, he does! But not here! Not now! This is all illusion.”
Susan Calvin nodded, her voice a whisper, “Yes! Yes!” She was clutching Herbie’s arm, clinging to it, repeating over and over, “It isn’t true, is it? It isn’t, is it?”
Just how she came to her senses, she never knew—but it was like passing from a world of misty unreality to one of harsh sunlight. She pushed him away from her, pushed hard against that steely arm, and her eyes were wide.
“What are you trying to do?” Her voice rose to a harsh scream. “What are you trying to do?”
Herbie backed away, “I want to help.”
The psychologist stared, “Help? By telling me this is a dream? By trying to push me into schizophrenia?” A hysterical tenseness seized her, “This is no dream! I wish it were!”
I think The Whispering Earring is a fundamentally different thing—its broad message is “automation will atrophy the skills that make us human”, which is a pretty common message in sci-fi, and distinct from “isolation from human feedback will remove a necessary check on our worst impulses”, which I think is what OP was asking about.
Reason, as far as I know, is about robots attributing religious significance to their designated function. I don’t think it fits either. It’s an interesting take on aligning superficially human-like AI, though.
Robbie is closer, in that it broaches the idea of isolation from human interaction, but Mrs. Weston is portrayed as being in the wrong for disrupting a genuine friendship between her daughter and the robot.
To answer OP’s question, The Veldt is the closest thing that comes immediately to mind. Children raised by a machine-nursery become obsessed with the instant gratification it provides, and develop dangerously uncanny behavior as a result.
In Reason, the religious robot at one point starts to convince the human engineers that maybe it is right, but in the end the engineers hold on to their priors that humans created robots.
Why… would someone downvote this? Disagreement-votes seem totally fine, but it seems like someone just trying to honestly answer the question in a reasonable-ish way?

I downvoted my own answer because it wasn’t very good, relative to later answers.

Lol, I guess, fair enough.