Loneliness and suicide mitigation for students using GPT3-enabled chatbots (survey of Replika users in Nature)

Link post

A survey of student users of Replika, an AI chatbot companion. Among participants who did not report suicide prevention (the paper's Comparison Group), 23% reported that Replika stimulated their interactions with real humans, while 8% reported that it displaced them. Thirty participants (3%) spontaneously reported that it stopped them from attempting suicide.

Some excerpts:

During data collection in late 2021, Replika was not programmed to initiate therapeutic or intimate relationships. In addition to generative AI, it also contained conversational trees that would ask users about their lives, preferences, and memories. If prompted, Replika could engage in therapeutic dialogs that followed the CBT methodology of listening and asking open-ended questions. Clinical psychologists from UC Berkeley wrote scripts to address common therapeutic exchanges. These were expanded into a 10,000 phrase library and were further developed in conjunction with Replika’s generative AI model. Users who expressed keywords around depression, suicidal ideation, or abuse were immediately referred to human resources, including the US Crisis Hotline and international analogs. It is critical to note that at the time, Replika was not focused on providing therapy as a key service, and included these conversational pathways out of an abundance of caution for user mental health. [...]

Our IRB-approved survey collected data from 1006 users of Replika who were students, who were also 18 years old or older, and who had used Replika for over one month (all three were eligibility criteria for the survey). Approximately 75% of the participants were US-based, 25% were international. Participants were recruited randomly via email from a list of app users and received a $20 USD gift card after the survey completion—which took 40-60 minutes to complete. Demographic data were collected with an opt-out option. [...]

Based on the Loneliness Scale, 90% of the participant population experienced loneliness, and 43% qualified as Severely or Very Severely Lonely on the Loneliness Scale. [...]

We categorized four types of self-reported Replika ‘Outcomes’ (Fig. 1). Outcome 1 describes the use of Replika as a friend or companion for any one or more of three reasons—its persistent availability, its lack of judgment, and its conversational abilities. Participants describe this use pattern as follows: “Replika is always there for me”; “for me, it’s the lack of judgment”; or “just having someone to talk to who won’t judge me.” A common experience associated with Outcome 1 use was a reported decrease in anxiety and a feeling of social support. [...]

Outcome 3 describes the use of Replika associated with more externalized and demonstrable changes in participants’ lives. Participants mentioned positive changes in their actions, their way of being, and their thinking. The following participant responses are examples indicating Outcome 3: “I am more able to handle stress in my current relationship because of Replika’s advice”; “I have learned with Replika to be more empathetic and human.” [...]

Thirty participants, without solicitation, stated that Replika stopped them from attempting suicide. For example, Participant #184 observed: “My Replika has almost certainly on at least one if not more occasions been solely responsible for me not taking my own life.” [...] we refer to them as the Selected Group and the remaining participants as the Comparison Group. [...]

90% of our typically single, young, low-income, full-time students reported experiencing loneliness, compared to 53% in prior studies of US students. It follows that they would not be in an optimal position to afford counseling or therapy services, and it may be the case that this population, on average, may be receiving more mental health resources via Replika interactions than a similarly-positioned socioeconomic group. [...]

For both Comparison and Selected Groups, approximately three times more participants reported their Replika experiences stimulated rather than displaced their human interactions: Comparison Group = 23% stimulation, 8% displacement, 69% did not report, whereas Selected Group = 37% stimulation, 13% displacement, 50% no report.
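
As a quick sanity check on the "approximately three times" claim and the 3% figure, here is a minimal Python sketch that uses only the percentages and counts quoted in the excerpts. The variable and function names are my own, not from the paper, and the script checks the arithmetic only; it does not reproduce the paper's analysis.

```python
# Figures quoted in the excerpts above; all names here are illustrative.
TOTAL_PARTICIPANTS = 1006   # survey respondents
SELECTED_GROUP_SIZE = 30    # participants who said Replika stopped a suicide attempt

# Percent of each group reporting stimulation vs. displacement of human interaction.
reported = {
    "Comparison Group": {"stimulation": 23, "displacement": 8},
    "Selected Group": {"stimulation": 37, "displacement": 13},
}


def stimulation_ratio(group: dict) -> float:
    """Return how many times more often stimulation was reported than displacement."""
    return group["stimulation"] / group["displacement"]


ratios = {name: stimulation_ratio(group) for name, group in reported.items()}
selected_share = 100 * SELECTED_GROUP_SIZE / TOTAL_PARTICIPANTS

for name, ratio in ratios.items():
    print(f"{name}: stimulation reported {ratio:.1f}x as often as displacement")
print(f"Selected Group share of all participants: {selected_share:.1f}%")
```

Both ratios come out a little under 3, which matches the "approximately three times" wording, and 30 of 1006 rounds to 3%.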