SynBullying: A Multi LLM Synthetic Conversational Dataset for Cyberbullying Detection

Arefeh Kazemi, Hamza Qadeer, Joachim Wagner, Hossein Hosseini, Sri Balaaji Natarajan Kalaivendan, Brian Davis

TL;DR

SynBullying presents a synthetic, multi-LLM conversational dataset for cyberbullying detection that emphasizes context and multi-turn dynamics. By comparing GPT-4o, LLaMA, and Grok-generated data to an authentic teen role-play baseline, the work evaluates label quality, lexical realism, sentiment/toxicity, role dynamics, harm, and CB-type distributions. Results indicate synthetic data can approximate several authentic patterns and serve as valuable augmentation and stress-testing material, though it cannot fully replace human annotations due to distributional differences and subjectivity. The findings support a complementary data strategy that blends synthetic and authentic resources to scale robust, ethically responsible CB detection systems, with future work targeting multilingual and cross-platform generalization.

Abstract

We introduce SynBullying, a synthetic multi-LLM conversational dataset for studying and detecting cyberbullying (CB). SynBullying provides a scalable and ethically safe alternative to human data collection by leveraging large language models (LLMs) to simulate realistic bullying interactions. The dataset offers (i) conversational structure, capturing multi-turn exchanges rather than isolated posts; (ii) context-aware annotations, where harmfulness is assessed within the conversational flow considering context, intent, and discourse dynamics; and (iii) fine-grained labeling, covering various CB categories for detailed linguistic and behavioral analysis. We evaluate SynBullying across several dimensions, including conversational structure, lexical patterns, sentiment/toxicity, role dynamics, harm intensity, and CB-type distribution. We further examine its utility by testing its performance as standalone training data and as an augmentation source for CB classification.

Paper Structure

This paper contains 29 sections, 4 figures, 10 tables.

Figures (4)

  • Figure 1: Inter-annotator agreement between GPT-4o and human gold labels for CB type classification on the authentic dataset.
  • Figure 2: Performance metrics of GPT-4o for predicting CB types, evaluated against human gold labels on the authentic dataset.
  • Figure 3: Fleiss’ Kappa scores for each CB type, with all insult-related types aggregated. The figure compares inter-annotator agreement between GPT-4o and human gold labels with the agreement among human annotators.
  • Figure 4: Distribution of CB types.
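
For readers unfamiliar with the agreement statistic named in the Figure 3 caption, the following is a minimal sketch of the standard Fleiss' Kappa formula. It is not the authors' evaluation code: the function name `fleiss_kappa`, the three-category setup, and the toy counts are all illustrative assumptions, and how the paper groups raters or aggregates insult-related types is not specified in this excerpt.

```python
def fleiss_kappa(counts):
    """Fleiss' Kappa for a table where counts[i][j] is the number of
    raters who assigned item i to category j.

    Assumes every item was rated by the same number of raters (>= 2).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each item needs the same number (>= 2) of ratings")
    n_cats = len(counts[0])

    # Share of all ratings that fall into each category.
    p_cat = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_cats)
    ]

    # Per-item agreement: fraction of rater pairs that agree on the item.
    p_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]

    p_observed = sum(p_item) / n_items      # mean observed agreement
    p_expected = sum(p * p for p in p_cat)  # agreement expected by chance
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when all ratings share one category")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical toy data: 4 messages, 3 raters, 3 made-up categories
# (columns could stand for e.g. "insult", "threat", "not bullying").
toy_counts = [
    [3, 0, 0],  # all three raters agree
    [0, 3, 0],  # all three raters agree
    [2, 1, 0],  # two against one
    [1, 1, 1],  # complete disagreement
]
kappa = fleiss_kappa(toy_counts)
print(round(kappa, 3))
```

A value of 1 indicates perfect agreement, values near 0 indicate agreement no better than chance, and negative values indicate agreement below chance. The sketch takes per-item category counts rather than raw rater labels, which keeps it independent of how individual annotators are identified.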