MentalManip: A Dataset For Fine-grained Analysis of Mental Manipulation in Conversations

Yuxin Wang, Ivory Yang, Saeed Hassanpour, Soroush Vosoughi

TL;DR

A new dataset, MentalManip, consisting of $4,000$ annotated movie dialogues, is introduced. It enables a comprehensive analysis of mental manipulation, pinpointing both the techniques used for manipulation and the vulnerabilities targeted in victims.

Abstract

Mental manipulation, a significant form of abuse in interpersonal conversations, presents a challenge to identify due to its context-dependent and often subtle nature. The detection of manipulative language is essential for protecting potential victims, yet the field of Natural Language Processing (NLP) currently faces a scarcity of resources and research on this topic. Our study addresses this gap by introducing a new dataset, named ${\rm M{\small ental}M{\small anip}}$, which consists of $4,000$ annotated movie dialogues. This dataset enables a comprehensive analysis of mental manipulation, pinpointing both the techniques utilized for manipulation and the vulnerabilities targeted in victims. Our research further explores the effectiveness of leading-edge models in recognizing manipulative dialogue and its components through a series of experiments with various configurations. The results demonstrate that these models inadequately identify and categorize manipulative content. Attempts to improve their performance by fine-tuning with existing datasets on mental health and toxicity have not overcome these limitations. We anticipate that ${\rm M{\small ental}M{\small anip}}$ will stimulate further research, leading to progress in both understanding and mitigating the impact of mental manipulation in conversations.

Paper Structure

This paper contains 35 sections, 1 equation, 13 figures, and 15 tables.

Figures (13)

  • Figure 1: An example dialogue that contains elements of mental manipulation which GPT-4 fails to identify (temperature $= 0$). The manipulative parts are highlighted in red.
  • Figure 2: Multi-level taxonomy of $\textsc{MentalManip}$.
  • Figure 3: Statistics of $\textsc{MentalManip}_{\text{con}}$ and $\textsc{MentalManip}_{\text{maj}}$. The x-axis ticks in the left two panels are abbreviations for techniques and vulnerabilities (see Appendix \ref{appendix:definition}). The emotion distribution of the $\textsc{MentalManip}_{\text{maj}}$ dataset is in Appendix \ref{appendix:statistics_maj}.
  • Figure 4: Co-occurrence heat maps among techniques (left), vulnerabilities (center), and techniques and vulnerabilities (right) in the $\textsc{MentalManip}_{\text{con}}$ dataset. A darker cell indicates a higher co-occurrence. The same figures for the $\textsc{MentalManip}_{\text{maj}}$ dataset are in Appendix \ref{appendix:statistics_maj}.
  • Figure 5: t-SNE visualization of Sentence Transformer embeddings of manipulative and non-manipulative dialogues in $\textsc{MentalManip}_{\text{con}}$ (left) and the distribution of $\textsc{MentalManip}$ and other dialogical datasets (right).
  • ...and 8 more figures