Enhancing Large Language Models for Detecting Mental Manipulation via Annotation-Free Data Augmentation and Anti-Curriculum Distillation

Yuansheng Gao, Han Bao, Tong Zhang, Bin Li, Jixiang Luo, Ronghao Chen, Zonghui Wang, Wenzhi Chen

TL;DR

MentalMAC tackles the detection of mental manipulation in multi-turn dialogues by combining annotation-free data augmentation (EvoSA), teacher-generated multi-task supervision, and task-level anti-curriculum distillation. It also introduces ReaMent, a dataset of 5,000 real-world dialogues for evaluation. In the reported experiments, MentalMAC improves on the best-performing baseline by up to 25.9% in F1mac and 8.1% in accuracy, and its distilled student models outperform commercial LLMs such as GPT-4 and Claude-3.5-Sonnet on this task.

Abstract

Mental manipulation is a subtle yet pervasive form of psychological abuse that poses serious threats to mental health. Nevertheless, detecting mental manipulation remains a largely underexplored research problem. The field faces three major challenges: (i) insufficient and hard-to-obtain training data; (ii) the covert nature of mental manipulation, which hinders detection; and (iii) the lack of real-world datasets. To address these challenges, we propose MentalMAC, a novel framework that enhances large language models' ability to detect elements of mental manipulation in multi-turn dialogue. Our approach consists of three key components: EvoSA, an annotation-free data augmentation method based on evolutionary operations and speech act theory; teacher-model-generated multi-task supervision; and progressive task-level anti-curriculum distillation. We then construct the ReaMent dataset, comprising 5,000 real-world dialogue samples, using MentalMAC-distilled models to aid human annotation. Extensive experiments show that MentalMAC achieves up to 25.9% improvement in F1mac and 8.1% in accuracy over the best-performing baseline, outperforming commercial LLMs such as GPT-4 and Claude-3.5-Sonnet. Warning: This paper contains content that may be offensive to the reader.

Paper Structure

This paper contains 23 sections, 12 equations, 5 figures, 4 tables.

Figures (5)

  • Figure 1: Comparative example illustrating our method in comparison with GPT-4 (Achiam et al., 2023) using intent-aware prompting (Ma et al., 2025). Person2 (right) is the manipulator, while Person1 (left) is the target.
  • Figure 2: Overall workflow of the proposed MentalMAC. Stage 1: EvoSA expands the training dataset in an annotation-free manner. Stage 2: A teacher model produces correct and incorrect rationales and generates corrective feedback for the latter, yielding multi-task training data. Stage 3: Task-level anti-curriculum distillation, progressing from hard to easy tasks, is applied to the student model.
  • Figure 3: EvoSA prompts for dialogue generation. To enable annotation-free synthesis, Dialogue 1 and Dialogue 2 are randomly selected as parent dialogues with the same label. An initial child dialogue is generated using selection, crossover, and mutation, where orange text guides the LLM to focus on speech acts and dialogue elements. The LLM then polishes the child dialogue, analyzes the presence or absence of mental manipulation in the parent dialogues, and revises the child dialogue to ensure consistency with the parent label (blue text reflects the parent label).
  • Figure 4: An example of rationales for dialogues that include (“Rationale w”) and do not include (“Rationale w/o”) elements of mental manipulation.
  • Figure 5: Relative improvements in our method compared to the best student baseline.
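
The Figure 3 caption states that EvoSA draws two parent dialogues with the same label, which is what lets the child dialogue inherit a label without new annotation. The following is a minimal Python sketch of that parent-selection step only, not the authors' implementation. The function name `sample_same_label_parents`, the `(dialogue, label)` tuple layout, and the uniform choice over labels are all assumptions made for illustration; the crossover, mutation, polishing, and label-consistency revision are performed by an LLM in the paper and are not reproduced here.

```python
import random
from collections import defaultdict


def sample_same_label_parents(dataset, rng):
    """Pick two distinct dialogues that share a label.

    dataset: list of (dialogue, label) pairs (hypothetical layout).
    rng: a random.Random instance, passed in so runs are reproducible.
    Returns (parent1, parent2, label); the child dialogue generated from
    these parents would be assigned `label` without human annotation.
    """
    by_label = defaultdict(list)
    for dialogue, label in dataset:
        by_label[label].append(dialogue)

    # Only labels with at least two dialogues can supply a parent pair.
    eligible = sorted(lab for lab, items in by_label.items() if len(items) >= 2)
    if not eligible:
        raise ValueError("need at least two dialogues sharing a label")

    label = rng.choice(eligible)  # assumption: labels are chosen uniformly
    parent1, parent2 = rng.sample(by_label[label], 2)  # two distinct parents
    return parent1, parent2, label


if __name__ == "__main__":
    toy = [
        ("dialogue A", 1),
        ("dialogue B", 1),
        ("dialogue C", 0),
        ("dialogue D", 0),
    ]
    p1, p2, label = sample_same_label_parents(toy, random.Random(0))
    print(label, p1, p2)
```

In a full pipeline, the two parents and their shared label would then be placed into the generation prompt described in Figure 3.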