Detecting sexually explicit content in the context of the child sexual abuse materials (CSAM): end-to-end classifiers and region-based networks

Weronika Gutfeter; Joanna Gajewska; Andrzej Pacut

TL;DR

Three approaches to classifying sexually explicit content, a crucial component of an automated CSAM detection system, are explored: an end-to-end classifier, a classifier with person detection, and a private body parts detector.

Abstract

Child sexual abuse materials (CSAM) pose a significant threat to the safety and well-being of children worldwide. Detecting and preventing the distribution of such materials is a critical task for law enforcement agencies and technology companies. Because content moderation is often manual, an automated detection system can reduce human reviewers' exposure to potentially harmful images and speed up the response to such content. This study presents methods for classifying sexually explicit content, a crucial component of an automated CSAM detection system. Three approaches are explored: an end-to-end classifier, a classifier with person detection, and a private body parts detector. All proposed methods are tested on images obtained from an online tool for reporting illicit content. Due to legal constraints, access to the data is limited, and all algorithms are executed remotely on an isolated server. The end-to-end classifier yields the most promising results, reaching an accuracy of 90.17% after the training set is augmented with additional neutral samples and adult pornography. While detection-based methods may not achieve higher accuracy and cannot serve as a final classifier on their own, including them in the system can be beneficial. Human body-oriented approaches generate results that are easier to interpret, and interpretability is essential when analyzing models that are trained without direct access to the data.

Paper Structure (9 sections, 1 equation, 3 figures, 6 tables)

Introduction
CSAM detection methods
Source of illegal images
Proposed CSAM classification schema
Sexually-explicit content classification: model architectures and training
Pretraining model with external data
Person detection
Private body parts detection
Conclusions

Figures (3)

Figure 1: Decision schema that leads to the final classification in the proposed CSAM detection system. A complete system consists of two models: one performs age estimation and the second predicts whether the image contains sexually explicit elements (SE classification). Two-stage classification enables the system to distinguish between CSAM, adult pornography and neutral images.
Figure 2: Distinguishing between images that contain sexually explicit content (SE) and not-sexual ones (NS) is sufficient for the system to detect CSAM. However, we decided to divide the SE category into images showing sexual activity and images showing sexual posing. The not-sexual group can contain hard samples, such as images showing neutral nudity and soft erotica.
Figure 3: Images from the COCO dataset [coco2015] incorrectly classified as SE by the first version of a model trained only on DN-A (SEN-EM). With only a small number of neutral samples, the model focuses on images of people eating or holding something near their mouths, and on close-up images of hands. These false positives were eliminated after adding a pretraining phase on extended datasets.
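
The two-stage decision schema described in the Figure 1 caption can be sketched as a small function that combines the outputs of the two models. This is only an illustrative sketch: the names `Label` and `combine_decisions` are hypothetical, both model outputs are assumed to be already reduced to booleans, and the mapping (SE with a minor gives CSAM, SE with an adult gives adult pornography, otherwise neutral) is inferred from the caption rather than taken from the authors' implementation.

```python
from enum import Enum


class Label(Enum):
    """Final categories named in the Figure 1 caption."""
    CSAM = "csam"
    ADULT_PORNOGRAPHY = "adult_pornography"
    NEUTRAL = "neutral"


def combine_decisions(is_sexually_explicit: bool, is_minor: bool) -> Label:
    """Combine the SE classifier output with the age estimation output.

    Assumption: each model's output has been thresholded to a boolean
    upstream; the thresholds themselves are not specified in the source.
    """
    if not is_sexually_explicit:
        # Not-sexual (NS) images are neutral regardless of estimated age.
        return Label.NEUTRAL
    # Sexually explicit (SE) images are split by the age estimate.
    return Label.CSAM if is_minor else Label.ADULT_PORNOGRAPHY
```

Under these assumptions, the age estimate only matters once an image has been classified as SE, which is why the SE classifier is the focus of the sections listed above.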
