Revisiting Early Detection of Sexual Predators via Turn-level Optimization

Jinmyeong An, Sangwon Ryu, Heejin Do, Yunsu Kim, Jungseul Ok, Gary Geunbae Lee

TL;DR

This work tackles early sexual predator detection (eSPD) by moving from chat-level supervision to turn-level risk labels based on luring communication theory (LCT), which gives finer-grained supervision of risky utterances. It introduces SCoRL, a speed control reinforcement learning framework that optimizes an MDP-based detection policy with a novel speed control reward balancing intervention speed against accuracy; the detection head is trained after an SFT stage. A new turn-level eSPD benchmark evaluates early detection with turn-level risk signals and latency-aware metrics, addressing limitations of prior chat-level metrics. Empirical results on the PANC dataset show that SCoRL outperforms chat-level baselines in latency-weighted F1 and produces more precise, timely detections that align with grooming strategies, indicating practical potential for proactive online safety interventions.

Abstract

Online grooming is a severe social threat in which sexual predators gradually entrap child victims through subtle manipulation. Timely intervention is therefore critical for proactive protection. However, previous methods fail to determine the optimal intervention point (i.e., they jump to conclusions) because they rely on chat-level risk labels, which provide only weak supervision of risky utterances. For timely detection, we propose speed control reinforcement learning (SCoRL), which incorporates a practical strategy derived from luring communication theory (LCT); the code and supplementary materials are available at https://github.com/jinmyeongAN/SCoRL. To capture the predator's turn-level entrapment, we use a turn-level risk label based on LCT. We then design a novel speed control reward function that balances the trade-off between speed and accuracy using the turn-level risk label, so that SCoRL can identify the optimal intervention moment. In addition, we introduce a turn-level metric for precise evaluation and identify limitations of the previously used chat-level metrics. Experimental results show that SCoRL effectively preempts online grooming, offering a more proactive and timely solution. Further analysis reveals that our method improves performance while intuitively identifying optimal early intervention points.
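To make the speed-accuracy trade-off concrete, the following is a minimal toy sketch of a reward computed from turn-level risk labels. It is not the reward used in the paper, whose exact form is not given in this abstract; the function name `toy_speed_reward`, the `delay_penalty` parameter, and the specific values (+1, -1, linear decay) are all assumptions made for illustration.

```python
from typing import Optional, Sequence


def toy_speed_reward(
    turn_labels: Sequence[int],
    detect_turn: Optional[int],
    delay_penalty: float = 0.1,  # hypothetical per-turn cost of detecting late
) -> float:
    """Illustrative reward only; NOT the SCoRL reward from the paper.

    turn_labels: 0/1 turn-level risk labels for one conversation.
    detect_turn: index of the turn where the alarm was raised, or None.
    """
    # Index of the first risky turn, or None if the conversation is benign.
    first_risky = next((i for i, y in enumerate(turn_labels) if y == 1), None)

    if detect_turn is None:
        # No alarm: correct for a benign chat, a miss for a risky one.
        return 1.0 if first_risky is None else -1.0

    if first_risky is None or detect_turn < first_risky:
        # Alarm raised before any risky turn: premature (jumping to conclusions).
        return -1.0

    # Alarm at or after the first risky turn: reward decays with the delay.
    delay = detect_turn - first_risky
    return max(0.0, 1.0 - delay_penalty * delay)
```

Under these assumptions, an alarm raised exactly at the first risky turn earns the full reward, an alarm raised too early is penalized as a false positive, and a late alarm earns progressively less.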

Paper Structure

This paper contains 37 sections, 8 equations, 8 figures, 4 tables.

Figures (8)

  • Figure 1: Yellow arrows highlight the weakness of dialogue-level (chat-level) supervision in accurately identifying risky utterances. Previous approaches segment long conversations and assign the same dialogue-level label to each segment. However, risky utterances are often sparse within these segments, leading to mislabeling and increased false positives during normal conversations.
  • Figure 2: Training overview of SCoRL. For a conversation $C$, the dialogue history $h_t$ up to the current time step $t$ is fed in sequentially. At each step, the supervised fine-tuning (SFT) model is trained with the turn-level risk label $y_t^{turn}$ for all turns. Unlike SFT, the reinforcement learning (RL) process updates only the detection head. When $a_t = 1$, early detection is triggered and subsequent turns are excluded from training. A $p$ value is calculated for each conversation, and the model is updated based on the speed-control (SC) reward mechanism.
  • Figure 3: Cumulative graph showing the progression of three strategies (PI, A, and G) over time. The x-axis represents the number of turns in the conversation, and the y-axis indicates the cumulative sum of the strategies. The gray vertical line marks the average early detection point of the existing model, and the blue vertical line marks the average detection point achieved by SCoRL.
  • Figure 4: Strategy ratio graph for the detected utterance at the point of early detection for each model. The categories include: PI, G, A, and Others.
  • Figure 5: False positive rates of previous methods and our method, highlighting how well our method minimizes false positives in low-risk scenarios.
  • ...and 3 more figures
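
The sequential procedure described in the Figure 2 caption (feed the history $h_t$ turn by turn, stop at the first $a_t = 1$, and exclude later turns) can be sketched as below. This is a minimal illustration, not the authors' implementation: `rollout` and the stand-in `marker_policy` are hypothetical names, and a real policy would be a fine-tuned language model with a detection head rather than a string check.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def rollout(
    turns: Sequence[str],
    policy: Callable[[List[str]], int],
) -> Tuple[List[int], Optional[int]]:
    """Run a detection policy over a conversation, stopping at the first alarm.

    Returns the actions taken and the index of the detection turn (or None).
    """
    actions: List[int] = []
    for t in range(len(turns)):
        history = list(turns[: t + 1])  # dialogue history h_t up to turn t
        a_t = policy(history)           # 1 = raise alarm, 0 = keep reading
        actions.append(a_t)
        if a_t == 1:
            # Early detection triggered: later turns are never visited.
            return actions, t
    return actions, None


def marker_policy(history: List[str]) -> int:
    """Hypothetical stand-in policy: flags the placeholder token 'RISKY'."""
    return 1 if "RISKY" in history[-1] else 0


conversation = ["hello", "how was school", "RISKY message", "later message"]
actions, detect_turn = rollout(conversation, marker_policy)
print(actions, detect_turn)  # [0, 0, 1] 2
```

Because the loop returns at the first alarm, the fourth turn never contributes an action, which mirrors the caption's statement that turns after detection are excluded.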