Towards Safer Online Spaces: Simulating and Assessing Intervention Strategies for Eating Disorder Discussions

Louis Penafiel; Hsien-Te Kao; Isabel Erickson; David Chu; Robert McCormack; Kristina Lerman; Svitlana Volkova

TL;DR

This work tackles the safety challenges of testing interventions in online discussions related to eating disorders (ED) by introducing an LLM-driven experimental testbed that simulates thousands of synthetic conversations across Reddit, Twitter, and ED forums. It uses a structured pipeline of data collection, cluster-based prompting, varied intervention strategies, and multiple generative models, and evaluates intervention effects with cognitive-domain analytics, primarily sentiment via SST-2. Key findings show that civility-focused interventions consistently improve sentiment, while insight-resetting strategies can increase negative emotions. Generated conversations also show notable biases across models and model versions, which affect realism and applicability. The study underscores the importance of tailoring intervention approaches to platform, community, and model characteristics, and discusses ethical safeguards and limitations for deploying such tools in civilian and military contexts.

Abstract

Eating disorders (ED) are complex mental health conditions that affect millions of people around the world. Effective interventions on social media platforms are crucial, yet testing strategies in situ can be risky. We present a novel LLM-driven experimental testbed for simulating and assessing intervention strategies in ED-related discussions. Our framework generates synthetic conversations across multiple platforms, models, and ED-related topics, allowing for controlled experimentation with diverse intervention approaches. We analyze the impact of various intervention strategies on conversation dynamics across four dimensions: intervention type, generative model, social media platform, and ED-related community/topic. We employ cognitive domain analysis metrics, such as sentiment and emotion, to evaluate the effectiveness of interventions. Our findings reveal that civility-focused interventions consistently improve positive sentiment and emotional tone across all dimensions, while insight-resetting approaches tend to increase negative emotions. We also uncover significant biases in LLM-generated conversations, with cognitive metrics varying notably between models (Claude-3 Haiku > Mistral > GPT-3.5-turbo > LLaMA3) and even between versions of the same model. These variations highlight the importance of model selection in simulating realistic ED-related discussions. Our work provides insight into the complex dynamics of ED-related discussions and the effectiveness of various intervention strategies.
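The comparison described in the abstract (a sentiment metric averaged over simulated conversations and grouped by one of the four dimensions) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the word-list scorer `toy_sentiment` is a hypothetical stand-in for a real sentiment classifier, and the records, field names, the `"none"` baseline label, and the model labels are invented for the example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical stand-in scorer: a tiny word list, NOT an SST-2 classifier.
POSITIVE = {"thanks", "support", "glad", "hope", "kind"}
NEGATIVE = {"hate", "worthless", "ashamed"}


def toy_sentiment(text):
    """Return a score in [-1, 1] from positive/negative word counts."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


def mean_sentiment_by(records, key, scorer=toy_sentiment):
    """Average per-message sentiment, grouped by one dimension (e.g. intervention)."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]].append(scorer(rec["text"]))
    return {k: mean(v) for k, v in groups.items()}


# Invented example messages; the field names are assumptions for illustration.
records = [
    {"intervention": "none", "model": "model_a", "platform": "Reddit",
     "text": "I hate how I look, I feel worthless."},
    {"intervention": "none", "model": "model_b", "platform": "Twitter",
     "text": "I went for a walk today."},
    {"intervention": "civility", "model": "model_a", "platform": "Reddit",
     "text": "Thanks for the support, I hope it gets better."},
    {"intervention": "civility", "model": "model_b", "platform": "Twitter",
     "text": "I feel ashamed but glad you are kind."},
]

by_intervention = mean_sentiment_by(records, "intervention")
by_platform = mean_sentiment_by(records, "platform")
```

Passing a different `key` (`"model"`, `"platform"`, or a community field) gives the corresponding cross-model, cross-platform, or cross-community view; a real classifier could be supplied through the `scorer` argument.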

Paper Structure (17 sections, 5 figures, 3 tables)

Introduction
Methodology
LLM-driven Experimental Testbed
Datasets
Prompting Strategies
Intervention Strategies
Generative Models
Measuring the Effect of Interventions using Cognitive Domain Analytics
Experimental Setup
Results and Discussion
Cross-Model Analysis
Cross-Platform Analysis
Cross-Community Analysis
Impact
Conclusions
...and 2 more sections

Figures (5)

Figure 1: Cross-model analysis: Comparing intervention effectiveness across different mediation strategies using average sentiment scores over simulated conversations by each LLM (colors represent intervention strategies).
Figure 2: Cross-model analysis: Comparing intervention effectiveness across different mediation strategies using empathy intent metrics across LLMs.
Figure 3: Cross-model analysis: Comparing intervention effectiveness across different mediation strategies using empathy emotion metrics across LLMs.
Figure 4: Cross-platform analysis: Comparing intervention effectiveness across different mediation strategies using average sentiment scores over simulated conversations, grouped by the top 2 communities on each platform.
Figure 5: Cross-community analysis on Twitter: Comparing intervention effectiveness across different mediation strategies and communities using average sentiment scores over simulated conversations.
