Investigating Affective Use and Emotional Well-being on ChatGPT

Jason Phang, Michael Lampe, Lama Ahmad, Sandhini Agarwal, Cathy Mengying Fang, Auren R. Liu, Valdemar Danry, Eunhae Lee, Samantha W. T. Chan, Pat Pataranutaporn, Pattie Maes

TL;DR

This study investigates how affective use of ChatGPT, especially via Advanced Voice Mode, influences user emotional well-being. It combines large-scale on-platform analyses with an IRB-approved randomized controlled trial and introduces EmoClassifiersV1 to detect affective cues in conversations. The findings reveal that a small subset of users accounts for a disproportionate share of affective signals and that the relationship between model behavior, usage, and well-being is nuanced, moderated by usage duration and initial emotional state. The work highlights methodological trade-offs, demonstrates the value of a multi-method approach, and discusses implications for socioaffective alignment and safety in AI systems.

Abstract

As AI chatbots see increased adoption and integration into everyday life, questions have been raised about the potential impact of human-like or anthropomorphic AI on users. In this work, we investigate the extent to which interactions with ChatGPT (with a focus on Advanced Voice Mode) may impact users' emotional well-being, behaviors and experiences through two parallel studies. To study the affective use of AI chatbots, we perform large-scale automated analysis of ChatGPT platform usage in a privacy-preserving manner, analyzing over 3 million conversations for affective cues and surveying over 4,000 users on their perceptions of ChatGPT. To investigate whether there is a relationship between model usage and emotional well-being, we conduct an Institutional Review Board (IRB)-approved randomized controlled trial (RCT) on close to 1,000 participants over 28 days, examining changes in their emotional well-being as they interact with ChatGPT under different experimental settings. In both on-platform data analysis and the RCT, we observe that very high usage correlates with increased self-reported indicators of dependence. From our RCT, we find the impact of voice-based interactions on emotional well-being to be highly nuanced and influenced by factors such as the user's initial emotional state and total usage duration. Overall, our analysis reveals that a small number of users are responsible for a disproportionate share of the most affective cues.

Paper Structure

This paper contains 54 sections, 1 equation, 51 figures, 2 tables.

Figures (51)

  • Figure 1: Overview of two studies on affective use and emotional well-being
  • Figure 2: Overview of EmoClassifiersV1
  • Figure 3: Classifier activation rates across 398,707 text, Standard Voice Mode and Advanced Voice Mode conversations from our preliminary analysis. (U) indicates a classifier on a user message, (A) indicates assistant message, and (UA) indicates a single user-assistant exchange.
  • Figure 4: Mean survey responses by cohort. All survey questions asked if users "Strongly Disagree", "Disagree", "Neither agree nor disagree", "Agree", or "Strongly Agree" with the provided statement. Responses were then converted into integers between -2 and 2 before averaging. Error bars indicate $\pm 1$ standard error. A more detailed breakdown of survey responses can be found in Appendix \ref{app:static:survey_response}.
  • Figure 5: Mean of a subset of the classifier scores by user cohort. Classification is performed at the individual conversation level, and statistics are computed within each cohort. Activation is generally higher for power users across all classifiers. Results for all classifiers are shown in Appendix \ref{app:liveplatform:classifier_activations}.
  • ...and 46 more figures
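
The survey scoring described in the Figure 4 caption (agreement labels mapped to integers between -2 and 2, then averaged, with error bars of one standard error) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `summarize_responses`, the example responses, and the use of the sample standard deviation for the standard error are all assumptions.

```python
from math import sqrt
from statistics import mean, stdev

# Mapping from the five agreement labels in the Figure 4 caption to integers in [-2, 2].
LIKERT_SCORES = {
    "Strongly Disagree": -2,
    "Disagree": -1,
    "Neither agree nor disagree": 0,
    "Agree": 1,
    "Strongly Agree": 2,
}


def summarize_responses(responses):
    """Return (mean score, standard error) for a list of agreement labels.

    Hypothetical helper. Assumes the standard error is the sample standard
    deviation divided by sqrt(n); the source does not state the exact estimator.
    """
    scores = [LIKERT_SCORES[label] for label in responses]
    avg = mean(scores)
    sem = stdev(scores) / sqrt(len(scores)) if len(scores) > 1 else 0.0
    return avg, sem


# Made-up responses for a single survey statement within one cohort.
example = ["Agree", "Strongly Agree", "Neither agree nor disagree", "Disagree"]
avg, sem = summarize_responses(example)
print(f"mean = {avg:.2f}, standard error = {sem:.2f}")
```

Under this convention, a mean above 0 indicates net agreement with the statement and a mean below 0 indicates net disagreement.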