A Fuzzy Evaluation of Sentence Encoders on Grooming Risk Classification

Geetanjali Bihani, Julia Rayz

TL;DR

The paper addresses grooming risk classification in online chats with a fuzzy framework that maps human assessments of grooming behaviors to three severity levels (moderate, significant, severe). It evaluates Transformer bi-encoders (SBERT, MPNet, RoBERTa) on three participant groups (law enforcement officers, real victims, and decoys) and applies alpha-cut defuzzification to derive discrete risk categories. The fine-tuned models struggle to detect indirect and coded grooming language, especially in decoy data, and their performance deteriorates in higher-risk contexts, likely because those contexts contain more out-of-vocabulary (OOV) tokens. The study highlights the need for robust models that recognize indirect speech acts and coded language, with implications for dataset design and for deployment in real-world grooming risk surveillance.
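
The alpha-cut step mentioned above can be illustrated with a minimal sketch. This summary does not give the paper's membership functions, its alpha value, or its tie-breaking rule, so everything here is an assumption for illustration: the membership degrees are made up, `alpha_cut` and `defuzzify` are hypothetical helper names, and choosing the most severe label inside the cut is one possible convention, not necessarily the authors'.

```python
from typing import Dict, List

# Severity labels in increasing order of risk (names taken from the Figure 2 caption).
SEVERITY_ORDER: List[str] = ["moderate", "significant", "severe"]


def alpha_cut(memberships: Dict[str, float], alpha: float) -> List[str]:
    """Return the crisp alpha-cut: labels whose membership degree is >= alpha."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [label for label in SEVERITY_ORDER if memberships.get(label, 0.0) >= alpha]


def defuzzify(memberships: Dict[str, float], alpha: float = 0.5) -> str:
    """Map fuzzy membership degrees to a single discrete risk category.

    Assumed convention: take the most severe label inside the alpha-cut;
    if the cut is empty, fall back to the label with the highest membership.
    """
    cut = alpha_cut(memberships, alpha)
    if cut:
        # SEVERITY_ORDER is ascending, so the last element is the most severe.
        return cut[-1]
    return max(SEVERITY_ORDER, key=lambda label: memberships.get(label, 0.0))


# Illustrative (made-up) membership degrees for one chat context.
example = {"moderate": 0.7, "significant": 0.55, "severe": 0.2}
print(defuzzify(example, alpha=0.5))  # significant
print(defuzzify(example, alpha=0.6))  # moderate
```

Raising alpha shrinks the cut, so a stricter threshold assigns a context to a higher-risk category only when its membership in that category is strong.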

Abstract

With the advent of social media, children are becoming increasingly vulnerable to the risk of grooming in online settings. Detecting grooming instances in an online conversation poses a significant challenge as the interactions are not necessarily sexually explicit, since the predators take time to build trust and a relationship with their victim. Moreover, predators evade detection using indirect and coded language. While previous studies have fine-tuned Transformers to automatically identify grooming in chat conversations, they overlook the impact of coded and indirect language on model predictions, and how these align with human perceptions of grooming. In this paper, we address this gap and evaluate bi-encoders on the task of classifying different degrees of grooming risk in chat contexts, for three different participant groups, i.e. law enforcement officers, real victims, and decoys. Using a fuzzy-theoretic framework, we map human assessments of grooming behaviors to estimate the actual degree of grooming risk. Our analysis reveals that fine-tuned models fail to tag instances where the predator uses indirect speech pathways and coded language to evade detection. Further, we find that such instances are characterized by a higher presence of out-of-vocabulary (OOV) words in samples, causing the model to misclassify. Our findings highlight the need for more robust models to identify coded language from noisy chat inputs in grooming contexts.
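
One simple way to quantify the "presence of OOV words" in a chat sample is the fraction of its tokens that fall outside a reference vocabulary. The sketch below is an assumption-laden illustration rather than the authors' procedure: the regex tokenizer, the toy vocabulary, and the helper name `oov_rate` are all hypothetical, and the abstract does not say how the paper measures OOV.

```python
import re
from typing import Set


def oov_rate(text: str, vocabulary: Set[str]) -> float:
    """Fraction of tokens in `text` that are absent from `vocabulary`.

    Tokenization is a deliberately naive lowercase word split (an assumption);
    a real pipeline would use the vocabulary of the encoder being evaluated.
    """
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    if not tokens:
        return 0.0
    missing = sum(1 for token in tokens if token not in vocabulary)
    return missing / len(tokens)


# Toy vocabulary and a chat-style message with abbreviated spellings.
toy_vocabulary = {"are", "you", "home", "alone"}
print(oov_rate("r u home alone", toy_vocabulary))  # 0.5
```

Abbreviated or coded spellings such as "r" and "u" raise the rate even when the underlying words are common, which is the kind of noise the abstract associates with misclassification.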

Paper Structure

This paper contains 8 sections, 5 equations, 2 figures, 3 tables.

Figures (2)

  • Figure 1: Presence of grooming strategies within a given chat context, used to create grooming risk categories
  • Figure 2: Confusion matrices illustrating the classification performance of the models across different participant groups: law enforcement officers (LEO), victims, and decoys. Each heatmap represents the comparison between actual and predicted labels for grooming risk categories (moderate, significant, severe). The top label refers to the participant group on which the model was fine-tuned. The color intensity denotes the count of instances falling into each category, with blue shades indicating higher counts, and red shades indicating lower counts.