Enhanced Online Grooming Detection Employing Context Determination and Message-Level Analysis
Jake Street, Isibor Ihianle, Funminiyi Olajide, Ahmad Lotfi
TL;DR
This work tackles Online Grooming (OG) detection under encryption by combining Message-Level Analysis (MLA) using transformer models with a Context Determination framework that classifies actor interactions via Actor Significance Thresholds (AST) and Message Significance Thresholds ($t$). The approach targets robust cross-dataset performance by evaluating on the PJ and PAN12 datasets, and it introduces per-message Age/Child scoring and transcript-level AC context determination. Transformer models (BERT, RoBERTa) outperform traditional ML in both MLA and Context Determination, achieving high F1 scores (up to ~0.97) on PJ and strong but dataset-skewed results on PAN12. The findings suggest that context-aware, transformer-based detection can improve real-time OG mitigation. Future work extends the approach to group chats, CCSO/FSCO differentiation, and additional behavioral insights for broader applicability.
Abstract
Online Grooming (OG) is a prevalent threat facing predominantly children online, with groomers using deceptive methods to prey on the vulnerability of children on social media and messaging platforms. These attacks can have severe psychological and physical impacts, including a tendency towards revictimization. Current technical measures are inadequate, especially with the advent of end-to-end encryption, which hampers message monitoring. Existing solutions focus on the signature analysis of child abuse media, which does not effectively address real-time OG detection. This paper proposes that OG attacks are complex and that detecting them requires identifying specific communication patterns between adults and children. It introduces a novel approach that uses advanced models such as BERT and RoBERTa for Message-Level Analysis, together with a Context Determination approach for classifying actor interactions, which introduces Actor Significance Thresholds and Message Significance Thresholds. The proposed method aims to improve accuracy and robustness in detecting OG by accounting for the dynamic and multi-faceted nature of these attacks. Cross-dataset experiments evaluate the robustness and versatility of the approach. The paper's contributions include improved detection methodologies and the potential for application in various scenarios, addressing gaps in current literature and practice.
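
The abstract does not spell out how the two thresholds interact, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that a Message-Level Analysis model has already assigned each message a score for how likely its author is a child, that the Message Significance Threshold decides whether a single message counts as child-like, and that the Actor Significance Threshold is the fraction of an actor's messages that must count before the actor is labelled a child. The names `ScoredMessage`, `classify_actors`, `determine_context`, the default threshold values, and the "non-AC" label are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class ScoredMessage:
    actor: str          # hypothetical actor identifier within a transcript
    child_score: float  # assumed MLA output: likelihood the author is a child


def classify_actors(
    messages: List[ScoredMessage],
    message_threshold: float = 0.5,  # assumed Message Significance Threshold
    actor_threshold: float = 0.5,    # assumed Actor Significance Threshold
) -> Dict[str, str]:
    """Label each actor 'child' or 'adult' from per-message scores."""
    counts: Dict[str, Tuple[int, int]] = {}
    for msg in messages:
        total, hits = counts.get(msg.actor, (0, 0))
        # A message is "significant" if its score reaches the message threshold.
        hit = 1 if msg.child_score >= message_threshold else 0
        counts[msg.actor] = (total + 1, hits + hit)
    # An actor is labelled a child if enough of their messages are significant.
    return {
        actor: "child" if hits / total >= actor_threshold else "adult"
        for actor, (total, hits) in counts.items()
    }


def determine_context(
    messages: List[ScoredMessage],
    message_threshold: float = 0.5,
    actor_threshold: float = 0.5,
) -> str:
    """Return 'AC' if the transcript contains both an adult and a child."""
    labels = classify_actors(messages, message_threshold, actor_threshold)
    return "AC" if set(labels.values()) == {"adult", "child"} else "non-AC"


transcript = [
    ScoredMessage("alice", 0.9),
    ScoredMessage("bob", 0.1),
    ScoredMessage("alice", 0.8),
    ScoredMessage("bob", 0.2),
    ScoredMessage("alice", 0.3),
]
actor_labels = classify_actors(transcript)
context = determine_context(transcript)
```

In this sketch only transcripts flagged as AC would be passed on to grooming detection. Raising either threshold makes the child label harder to assign, which trades missed AC conversations against false alarms.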
