Identifying Cyberbullying Roles in Social Media

Manuel Sandoval, Mohammed Abuhamad, Patrick Furman, Mujtaba Nazari, Deborah L. Hall, Yasin N. Silva

TL;DR

The paper addresses the identification of cyberbullying roles (Harasser, Victim, Bystander Defender, Bystander Assistant, Bystander Other) in social media. It builds machine learning models that use context–target Q&A transformations and counter severe class imbalance in the AMiCA dataset with ADASYN oversampling. It compares four LLM backbones (BERT, RoBERTa, T5, GPT-2), used both with classical classifiers and as fine-tuned models, under 10-fold cross-validation, and applies a $25^{th}$ percentile victim-confidence threshold to improve precision. The main finding is that oversampling combined with RoBERTa fine-tuning yields the strongest results (overall F1 $=83.5\%$, top-2 F1 $=95.7\%$; F1 rises to $89.3\%$ with the threshold, at $RR=16.4\%$). Performance is highest for well-represented classes and declines for scarce or contextually ambiguous ones. The work highlights strengths and limitations of current methods on limited data and outlines directions for richer labeling, multi-label role assignments, and cross-platform applicability. Overall, the study advances cyberbullying role detection by combining data augmentation, contextual embeddings, and thresholding for multi-class classification in a high-stakes social computing domain.

Abstract

Social media has revolutionized communication, allowing people worldwide to connect and interact instantly. However, it has also led to increases in cyberbullying, which poses a significant threat to children and adolescents globally, affecting their mental health and well-being. It is critical to accurately detect the roles of individuals involved in cyberbullying incidents to effectively address the issue on a large scale. This study explores the use of machine learning models to detect the roles involved in cyberbullying interactions. After examining the AMiCA dataset and addressing class imbalance issues, we evaluate the performance of various models built with four underlying LLMs (i.e., BERT, RoBERTa, T5, and GPT-2) for role detection. Our analysis shows that oversampling techniques help improve model performance. The best model, a fine-tuned RoBERTa using oversampled data, achieved an overall F1 score of 83.5%, increasing to 89.3% after applying a prediction threshold. The top-2 F1 score without thresholding was 95.7%. Our method outperforms previously proposed models. After investigating the per-class model performance and confidence scores, we show that the models perform well in classes with more samples and less contextual confusion (e.g., Bystander Other), but struggle with classes with fewer samples (e.g., Bystander Assistant) and more contextual ambiguity (e.g., Harasser and Victim). This work highlights current strengths and limitations in the development of accurate models with limited data and complex scenarios.

Paper Structure

This paper contains 7 sections, 3 figures, and 5 tables.

Figures (3)

  • Figure 1: Modeling pipeline. The dataset is processed, samples are transformed to target/context form, and class imbalance is then handled. Using 10-fold cross-validation, LLMs are evaluated with various methods for role detection.
  • Figure 2: ECDF of comment lengths using various tokenizers, with a zoomed-in chart in the center. The max-length is set to the 99th percentile, i.e., 103, 101, 121, and 101 tokens for the RoBERTa, GPT-2, T5, and BERT tokenizers, respectively.
  • Figure 3: ECDF of model confidence for correctly identified victim comments for fine-tuned LLMs on oversampled data. The 25th-percentile thresholds are $0.7863$, $0.6737$, $0.4198$, and $0.5611$ for RoBERTa, GPT-2, T5, and BERT, respectively.
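
The percentile-based choices described in Figures 2 and 3 can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (`percentile`, `victim_threshold`, `apply_threshold`) are hypothetical, the linear-interpolation rule for percentiles is an assumption, and so is the reading that any prediction whose confidence falls below the threshold is rejected and that the fraction of rejected predictions is reported as a rejection rate. All sample values are invented for illustration.

```python
import math
from typing import List, Optional, Sequence, Tuple


def percentile(values: Sequence[float], q: float) -> float:
    """Return the q-th percentile (0 <= q <= 100) of values.

    Assumption: linear interpolation between neighbouring order statistics;
    the summary does not state which interpolation rule the paper uses.
    """
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def victim_threshold(
    records: Sequence[Tuple[str, str, float]], q: float = 25.0
) -> float:
    """Threshold from confidences of correctly identified Victim comments.

    Each record is (true_label, predicted_label, confidence).
    """
    confs = [c for true, pred, c in records if true == pred == "Victim"]
    return percentile(confs, q)


def apply_threshold(
    predictions: Sequence[Tuple[str, float]], threshold: float
) -> Tuple[List[Optional[str]], float]:
    """Reject (set to None) predictions below the threshold.

    Assumption: the threshold is applied to every prediction, and the
    second return value is the fraction of predictions that were rejected.
    """
    kept = [label if conf >= threshold else None for label, conf in predictions]
    if not kept:
        return kept, 0.0
    rejected = sum(1 for label in kept if label is None)
    return kept, rejected / len(kept)


# Illustrative validation records: (true label, predicted label, confidence).
validation = [
    ("Victim", "Victim", 0.9),
    ("Victim", "Victim", 0.8),
    ("Victim", "Victim", 0.6),
    ("Victim", "Victim", 0.5),
    ("Victim", "Harasser", 0.7),      # misclassified, so ignored
    ("Harasser", "Harasser", 0.95),   # not a Victim comment, so ignored
]
threshold = victim_threshold(validation)

# Illustrative test-time predictions: (predicted label, confidence).
test_predictions = [
    ("Victim", 0.9),
    ("Harasser", 0.55),
    ("Bystander Other", 0.6),
    ("Victim", 0.4),
]
kept_labels, rejection_rate = apply_threshold(test_predictions, threshold)

# The same helper gives a max-length cutoff from per-comment token counts,
# rounded up to a whole number of tokens (counts are invented).
token_counts = [12, 18, 25, 31, 40, 44, 57, 63, 88, 120]
max_length = math.ceil(percentile(token_counts, 99))
```

Deriving the threshold from correctly classified Victim comments ties the cutoff to the confidence the model shows when it is right about that class. A lower percentile keeps more predictions, while a higher one trades coverage for precision.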