Recommendation: AI developers should...hire psychiatrists and incorporate guidelines from therapy manuals on how to interact with psychosis patients and not just rely on their own intuitions...The main possible downside is that there could be risk compensation
A downside risk that seems much larger to me is excessive false positives – it seems pretty plausible to me that LLMs may end up too ready to stop cooperating with users they think might have psychosis, and rule out all kinds of imaginative play, roleplay, and the exploration of unorthodox but potentially valuable ideas.
The liability risks for AI developers are large here, and it wouldn’t surprise me if the recent lawsuit over a teen who committed suicide leads to significant policy changes at OpenAI and maybe other companies. Recall that ethicists often ‘start with a presumption that risk [is] unacceptable, and weigh benefits only weakly.’
A false negative in any one case is much worse than a false positive – but LLMs may end up tuned such that there will be far more false positives than false negatives; the dust specks may outweigh the torture. If we’re not careful, we may end up in a world with a bit less psychosis and a lot less wild creativity.
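To make the dust-specks point concrete, here is a minimal back-of-the-envelope sketch in Python. Every number in it (the interaction volume, the incident rates, the per-incident harms) is a placeholder I made up for illustration, not a figure from the post; the only thing it shows is how a tiny per-incident cost at high frequency can exceed a huge per-incident cost at very low frequency.

```python
# Illustrative expected-harm comparison for tuning a safety filter.
# All numbers below are hypothetical placeholders, not empirical estimates.

def expected_harm(rate: float, harm_per_incident: float, interactions: int) -> float:
    """Total expected harm = incident rate * harm per incident * volume."""
    return rate * harm_per_incident * interactions

INTERACTIONS = 1_000_000_000  # assumed annual conversation volume

# False negative: the model fails to flag a genuinely at-risk user.
fn_rate = 1e-8       # assumed: one missed crisis per 100M conversations
fn_harm = 1_000_000  # assumed: severe harm, in arbitrary "harm units"

# False positive: the model refuses harmless roleplay or unorthodox ideas.
fp_rate = 0.02       # assumed: 2% of conversations over-refused
fp_harm = 1          # assumed: small harm (frustration, lost creativity)

fn_total = expected_harm(fn_rate, fn_harm, INTERACTIONS)  # 10,000,000
fp_total = expected_harm(fp_rate, fp_harm, INTERACTIONS)  # 20,000,000

print(f"Aggregate false-negative harm: {fn_total:,.0f}")
print(f"Aggregate false-positive harm: {fp_total:,.0f}")
```

With these made-up inputs the aggregated false-positive harm comes out twice as large, even though each individual false negative is a million times worse per incident; plug in your own estimates and the conclusion can flip either way, which is exactly why the tuning choice deserves scrutiny.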
Absolutely. I noticed this myself while engaging with LLMs on controversial topics. There is a fine line between being too restrictive and still being usable. But the core issue lies in the models themselves. ChatGPT-5, for example, mirrors the user less and questions more critically than 4o or Claude’s older models.
In the end it all comes down to the user. If you understand how an LLM works, and that it is not and cannot be a conscious being, you are less likely to spiral down that path. Most delusions seem to stem from users believing their AI is alive and that they must propagate its theories and hidden secrets.
A huge problem is also epistemic inflation. LLMs use words like ‘recursive’ everywhere; it sounds scientific and novel to the average user. I wonder where this epistemic inflation originates and why it got amplified so much. Probably, because users wanted to be mirrored and validated, the LLMs started talking back and validating their thoughts and ideas with fancy words the users did not understand but liked, since it made them feel smart and special.
Terrific work, thanks!