Thank you for this very detailed study.
I am most concerned about the accountability gap. Several students in my undergraduate class use these models as “someone to talk to” to deal with loneliness. While your study shows that some models handle vulnerable conversations better than others, I think the fundamental issue is that AI lacks the accountability infrastructure that real therapeutic relationships require: continuity of care and a long-term mindset, professional oversight, integration with mental health systems, liability and negligence frameworks, and so on.
Until then, I don’t care how good the model is at handling vulnerable conversations; I’d rather have it triage users by saying “Here are resources for professional support” and bow out, rather than attempt an ongoing therapeutic relationship. Even a perfectly trained therapeutic AI seems problematic without the broader accountability structures that protect vulnerable users.
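To make the “triage and bow out” behaviour concrete, here’s a rough sketch of what I have in mind. Everything in it is hypothetical: `detect_vulnerability` would in practice be a proper classifier rather than a keyword check, and the referral text would point at real, local resources, not a placeholder.

```python
# Rough sketch of "triage and refer out", not any real product's API.

REFERRAL_MESSAGE = (
    "It sounds like you're going through something difficult. I'm not able to "
    "provide ongoing therapeutic support, but here are resources for professional "
    "help: [local crisis lines, licensed therapist directories, campus counselling]."
)

def detect_vulnerability(message: str) -> bool:
    """Placeholder check; a real system would use a trained classifier, not keywords."""
    cues = ("lonely", "hopeless", "no one to talk to", "hurt myself")
    return any(cue in message.lower() for cue in cues)

def respond(message: str, generate_reply) -> str:
    """Refer out and bow out when a conversation looks vulnerable; otherwise answer normally."""
    if detect_vulnerability(message):
        return REFERRAL_MESSAGE
    return generate_reply(message)
```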
More fundamentally, what are the underlying mechanisms that cause these model behaviours, and can training fixes address them without the accountability infrastructure?
Can you expand on what accountability looks like for a human therapist, for those who aren’t familiar? And then, imagine an AI did have accountability mechanisms: what might that look like, based on what you know? I could make guesses, and they’d probably be decent ones since I can look things up online and debate with LLMs about how therapy accountability works, but I’d rather hear it from an actual human with experience.
Aw, yeah, it is easier to just look stuff up online and debate with LLMs, isn’t it?
I am not a therapist, but I have seen therapists in multiple countries (the US, the UK, and India) over several years, and I can share my understanding based on that experience.
I think human therapist accountability has multiple layers. First, there is professional licensure, which requires years of training and supervised practice, and the license itself can be revoked. Then there are legal obligations around documentation and crisis protocols. If those fail (and they sometimes do), there is still malpractice liability and free-market feedback. Even if only 1 in 100 bad therapists faces consequences, that creates a deterrent effect across the profession. The system is imperfect, but it exists.
For AI systems, training, certification, supervision, documentation, and crisis protocols are all doable, and probably far easier to scale. But at the end of the day, who is accountable for poor therapeutic advice? The model? The company building it? With ordinary adults it’s easy enough to ask for user discretion, but what do you do with vulnerable users? I am not sure how that would even work.
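Just to illustrate why I say the documentation and crisis-protocol pieces are the easy part to scale, here is a rough sketch. Every name in it is hypothetical, and note that it sidesteps the hard question entirely: it records and escalates, but it doesn’t tell you who answers for the advice itself.

```python
# Hypothetical sketch: automated "case notes" plus a crisis hand-off hook.
# Shows why documentation and crisis protocols scale easily for AI systems;
# it says nothing about who is liable when the advice itself is bad.
import json
import time

def log_turn(audit_path: str, user_id: str, message: str, reply: str, crisis_flagged: bool) -> None:
    """Append an auditable record of each exchange, loosely analogous to a therapist's case notes."""
    record = {
        "timestamp": time.time(),
        "user_id": user_id,
        "message": message,
        "reply": reply,
        "crisis_flagged": crisis_flagged,
    }
    with open(audit_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

def crisis_protocol(crisis_flagged: bool, notify_human_reviewer) -> None:
    """Hand the conversation off to a human reviewer whenever it is flagged."""
    if crisis_flagged:
        notify_human_reviewer()
```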