
Enhanced Online Grooming Detection Employing Context Determination and Message-Level Analysis

Jake Street, Isibor Ihianle, Funminiyi Olajide, Ahmad Lotfi

TL;DR

This work tackles Online Grooming (OG) detection under end-to-end encryption by combining transformer-based Message-Level Analysis (MLA) with a Context Determination framework that classifies actor interactions using Actor Significance Thresholds (AST) and Message Significance Thresholds ($t$). The approach targets robust cross-dataset performance: it is evaluated on PJ and PAN12 and introduces per-message Age/Child scoring and transcript-level AC context determination. Transformer models (BERT, RoBERTa) outperform traditional ML in both MLA and Context Determination, achieving high F1 scores (up to ~0.97) on PJ and strong but dataset-skewed results on PAN12. The findings suggest that context-aware, transformer-based detection can improve real-time OG mitigation; future work extends to group chats, CCSO/FSCO differentiation, and additional behavioral insights for broader applicability.

Abstract

Online Grooming (OG) is a prevalent threat facing predominantly children online, with groomers using deceptive methods to prey on the vulnerability of children on social media/messaging platforms. These attacks can have severe psychological and physical impacts, including a tendency towards revictimization. Current technical measures are inadequate, especially with the advent of end-to-end encryption, which hampers message monitoring. Existing solutions focus on the signature analysis of child abuse media, which does not effectively address real-time OG detection. This paper proposes that OG attacks are complex, requiring the identification of specific communication patterns between adults and children. It introduces a novel approach leveraging advanced models such as BERT and RoBERTa for Message-Level Analysis and a Context Determination approach for classifying actor interactions, including the introduction of Actor Significance Thresholds and Message Significance Thresholds. The proposed method aims to enhance accuracy and robustness in detecting OG by considering the dynamic and multi-faceted nature of these attacks. Cross-dataset experiments evaluate the robustness and versatility of our approach. This paper's contributions include improved detection methodologies and the potential for application in various scenarios, addressing gaps in current literature and practices.
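
The abstract names two thresholds but does not say how they are applied, so the following Python sketch is only one plausible reading, not the authors' method. It assumes a hypothetical per-message classifier that outputs a `child_score` in [0, 1], treats a message as significant when that score lies at least `t` away from 0.5, labels an actor "child" when the share of their significant messages scored as child-like reaches `ast`, and assumes "AC" denotes an adult-child pairing. All names, default values, and decision rules here are illustrative assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class ScoredMessage:
    actor: str          # sender of the message
    child_score: float  # hypothetical classifier output in [0, 1]; higher = more child-like


def label_actors(messages, t=0.2, ast=0.5):
    """Label each actor from their significant messages (illustrative rule only)."""
    votes = defaultdict(list)
    for msg in messages:
        # Assumed message threshold: keep only confidently scored messages.
        if abs(msg.child_score - 0.5) >= t:
            votes[msg.actor].append(msg.child_score > 0.5)

    labels = {}
    for actor in {m.actor for m in messages}:
        actor_votes = votes.get(actor, [])
        if not actor_votes:
            # No message cleared the message threshold for this actor.
            labels[actor] = "undetermined"
        elif sum(actor_votes) / len(actor_votes) >= ast:
            # Assumed actor threshold: enough significant messages look child-like.
            labels[actor] = "child"
        else:
            labels[actor] = "adult"
    return labels


def is_adult_child_context(labels):
    """True when the transcript contains exactly the labels 'adult' and 'child'."""
    return set(labels.values()) == {"adult", "child"}
```

Under these assumptions, raising `t` discards more low-confidence messages before any actor is labeled, while `ast` controls how strong the majority must be; a transcript is only flagged as adult-child when every actor has been labeled and both labels occur.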

Paper Structure

This paper contains 19 sections, 1 equation, 8 figures, 10 tables, and 1 algorithm.

Figures (8)

  • Figure 1: Methodology context.
  • Figure 2: Message-Level Analysis (MLA) process showing the Inter-Set and Cross-Set approaches.
  • Figure 3: Context determination process with AST and $t$ value application.
  • Figure 4: Example of a transcript for Formatted Processing (user 'DavieWants2' from PJ).
  • Figure 5: Example of a transcript for Name Processing (user 'ArmySgt1961' from PJ).
  • ...and 3 more figures