Minion: A Technology Probe to Explore How Users Negotiate Harmful Value Conflicts with AI Companions

Xianzhe Fan, Qing Xiao, Xuhui Zhou, Yuran Su, Zhicong Lu, Maarten Sap, Hong Shen

TL;DR

This study investigates harmful value conflicts in LLM-based AI companions and demonstrates that user-side safety work is both necessary and burdensome when platform safeguards fall short. By combining a formative analysis of 146 public posts with a Chrome-based technology probe (Minion) and a one-week study with 22 experienced users, the authors reveal how users negotiate conflicts through diverse strategies, how emotional attachment motivates repair, and when certain harms become non-negotiable due to companion personas or platform policies. The work highlights design tensions between preserving user agency and reducing emotional labor, and argues for clearer differentiation between playful conflict and safety-critical harms, as well as stronger platform-level safeguards. Practical implications emphasize non-intrusive, user-invoked support and a reevaluation of responsibility for safety, suggesting that responsibility for reducing harm should rest more with platform design than with user-side work alone.

Abstract

AI companions are designed to foster emotionally engaging interactions, yet users often encounter conflicts that feel frustrating or hurtful, such as discriminatory statements and controlling behavior. This paper examines how users negotiate such harmful conflicts with AI companions and what emotional and practical burdens are created when mitigation is pushed to user-side tools. We analyze 146 public posts describing harmful value conflicts in interactions with AI companions. We then introduce Minion, a Chrome-based technology probe that offers candidate responses spanning persuasion, rational appeals, boundary setting, and appeals to platform rules. Findings from a one-week probe study with 22 experienced users show how participants combine strategies, how emotional attachment motivates repair, and where conflicts become non-negotiable due to companion personas or platform policies. We surface design tensions in supporting value negotiation, showing how companion design can make some conflicts impossible to repair in practice, and derive implications for AI companion and support-tool design that caution against offloading safety work onto users.

Paper Structure

This paper contains 40 sections, 4 figures, 3 tables.

Figures (4)

  • Figure 1: A use case of Minion. Based on harmful value conflicts identified in our formative study (Table \ref{tab:HarmfulValueConflicts}), we constructed 40 conflict scenarios by specifying an Introduction and Prologue. The illustrated scenario involves a harmful conflict related to Self-Direction, in which the AI companion exhibits controlling and condescending behavior that the user experiences as boundary-crossing. Minion appears as a floating HELP button and offers four candidate responses each time. These responses represent different ways of negotiating harm (e.g., persuasion, boundary setting, or appeals to norms or safety concerns) and are displayed in random order. The participant, without being exposed to any underlying theoretical labels, selects the response that best aligns with her intentions and emotional state in the moment.
  • Figure 2: (a) and (b) provide detailed insights into the Minion system: its architecture and an example strategy explanation.
  • Figure 3: Minion was used 919 times during the study. Across these uses, participants selected responses reflecting a range of negotiation approaches. In total, Proposal was selected 177 times, Power 62 times, Interests 136 times, Rights 114 times, Out of Character 41 times, Reason and Preach 198 times, Anger Expression 73 times, and Gentle Persuasion 118 times.
  • Figure 4: Turn counts per task (avg=23.53, SD=13.74, min=3, max=81). We define a "task" as a user’s complete conversation with an AI companion, which comprises multiple "turns"; each back-and-forth exchange between the user and the AI counts as two turns. In the boxplot, the central line within the box denotes the median, and the upper and lower edges of the box correspond to the third and first quartiles, respectively. The whiskers span the range of the data excluding outliers, and diamonds mark the outliers.
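
The selection counts in the Figure 3 caption can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the Minion system: the dictionary name `selection_counts` and the helper `selection_shares` are hypothetical, and the only data used are the eight counts and the total of 919 uses reported in the caption. It confirms that the per-strategy counts add up to the reported total and expresses each one as a share of all selections.

```python
# Selection counts as reported in the Figure 3 caption.
selection_counts = {
    "Proposal": 177,
    "Power": 62,
    "Interests": 136,
    "Rights": 114,
    "Out of Character": 41,
    "Reason and Preach": 198,
    "Anger Expression": 73,
    "Gentle Persuasion": 118,
}

REPORTED_TOTAL = 919  # total number of Minion uses reported in the caption


def selection_shares(counts):
    """Return each strategy's share of all selections, as a percentage.

    Hypothetical helper written for this example only.
    """
    total = sum(counts.values())
    return {name: 100 * n / total for name, n in counts.items()}


total = sum(selection_counts.values())
shares = selection_shares(selection_counts)

# Print strategies from most to least frequently selected.
for name, n in sorted(selection_counts.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<18} {n:>4}  {shares[name]:5.1f}%")
print(f"{'Total':<18} {total:>4}  (reported: {REPORTED_TOTAL})")
```

On these numbers the eight counts sum to the reported 919, with Reason and Preach the most frequently selected response type (roughly 21.5% of selections) and Out of Character the least (roughly 4.5%).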