What Social Media Use Do People Regret? An Analysis of 34K Smartphone Screenshots with Multimodal LLM

Longjie Guo, Yue Fu, Xiran Lin, Xuhai "Orson" Xu, Yung-Ju Chang, Alexis Hiniker

TL;DR

This study examines how and why people regret their social media use on smartphones by combining experience sampling with passively collected screenshots analyzed by a multimodal large language model. The authors show that regret depends on the user's intention and on the specific social media activity: non-intentional use and viewing algorithmically recommended content produce the highest regret, while direct communication tends to be regretted less. By coding 34,313 screenshots across five apps and linking them to users' intentions and reported regret, the work reveals patterns of sidetracking and content exposure driven by design features in the attention economy. The methodology demonstrates a scalable, privacy-conscious approach to understanding mobile behavior and offers design and policy implications for better aligning digital experiences with user goals and autonomy, potentially enabling just-in-time interventions.

Abstract

Smartphone users often regret aspects of their phone use, especially social media use. However, pinpointing specific ways in which the design of an interface contributes to regrettable use can be challenging due to the complexity of social media app features and user intentions. We conducted a one-week study with 17 Android users, using a novel method where we passively collected screenshots every five seconds, which we analyzed via a multimodal large language model to understand participants' usage activity at a fine-grained level. Triangulating this data with data from experience sampling, surveys, and interviews, we found that regret varies based on user intention, with non-intentional and social media use being especially regrettable. Regret also varies by social media activity; participants were most likely to regret viewing algorithmically recommended content and comments. Additionally, participants frequently deviated to browsing social media when their intention was direct communication, which slightly increased their regret. Our findings provide guidance to designers and policy-makers seeking to improve users' experience and autonomy.

Paper Structure

This paper contains 39 sections, 11 figures, 6 tables.

Figures (11)

  • Figure 1: Screenshots of the data-collection app. The first screenshot shows the ESM prompt asking the user about their intended use when entering each app. The second screenshot shows the interface for selecting app sessions to share with the research team and uploading them (green sessions indicate sessions where the participant finished answering the regret survey question). The third screenshot shows the app session page, which presents screenshots for each captured session, along with the starting time and app name. The last screenshot shows the regret survey question, presented at the end of each app session page.
  • Figure 2: Normalized confusion matrix. Each row represents one actual class, and each column represents one predicted class. The number in each cell represents the proportion of a predicted class for a given actual class.
  • Figure 3: The process of using GPT-4o to code screenshots. The associated activity category for each screenshot was obtained using the steps shown in the diagram. (1) An image-to-text prompt was first constructed, which included instructions to describe all visual elements on the screenshot and how the user transitioned from the previous screenshot in the same session (if one exists). (2) The model sent back a detailed text description of the screenshot. (3) Combining the text descriptions of the previous four screenshots, we constructed the second, text-only prompt, which asked the model to categorize the user's activity. (4) The model sent back the category it identified and its justification for choosing that category.
  • Figure 4: Box plots of Regret by Intended Use. Regret values are responses to the prompt: "I feel regret about this phone use session" (1=Strongly Disagree, 7=Strongly Agree). Letters (a, b, c, ...) indicate pairs that are NOT significantly different from each other. For example, we did not find a statistically significant difference between Entertainment and Information, but did find a statistically significant difference between No Specific Goal and Social. For pairs with a statistically significant difference, $p<.001$ for all pairs except No Specific Goal and Social, where $p<.01$.
  • Figure 5: Top 10 apps with the highest proportion of sessions where the user-reported intended use was No Specific Goal, after filtering out apps that were used fewer than 20 times overall.
  • ...and 6 more figures
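
The two-step coding process described in the Figure 3 caption can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `describe` and `categorize` are hypothetical callables standing in for the image-to-text and text-only model calls, and the choice to pass the current description together with up to four preceding ones is an assumption about how the caption's "previous four screenshots" should be read.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical stand-ins for the two model calls; neither is a real API.
# describe(current_image, previous_image_or_None) -> text description
DescribeFn = Callable[[str, Optional[str]], str]
# categorize(descriptions, oldest first) -> (category, justification)
CategorizeFn = Callable[[List[str]], Tuple[str, str]]


def code_session(
    screenshots: List[str],
    describe: DescribeFn,
    categorize: CategorizeFn,
    context_size: int = 4,  # assumed: number of preceding descriptions to include
) -> List[Tuple[str, str]]:
    """Assign an activity category to every screenshot in one app session."""
    descriptions: List[str] = []
    results: List[Tuple[str, str]] = []
    for i, shot in enumerate(screenshots):
        # Steps 1-2: describe the screenshot, giving the model the previous
        # screenshot (if one exists) so it can describe the transition.
        previous = screenshots[i - 1] if i > 0 else None
        descriptions.append(describe(shot, previous))

        # Steps 3-4: categorize from text only, using the current description
        # plus up to `context_size` earlier descriptions from the same session.
        window = descriptions[max(0, i - context_size): i + 1]
        results.append(categorize(window))
    return results
```

Keeping the model calls behind plain callables means the session logic can be exercised with stubs before any real model is attached, and the context length can be varied without touching the loop.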