
Detecting sexually explicit content in the context of the child sexual abuse materials (CSAM): end-to-end classifiers and region-based networks

Weronika Gutfeter, Joanna Gajewska, Andrzej Pacut

TL;DR

Several approaches are explored for classifying sexually explicit content, a task that plays a crucial role in an automated CSAM detection system: an end-to-end classifier, a classifier with person detection, and a private body parts detector.

Abstract

Child sexual abuse materials (CSAM) pose a significant threat to the safety and well-being of children worldwide. Detecting and preventing the distribution of such materials is a critical task for law enforcement agencies and technology companies. As content moderation is often manual, an automated detection system can help reduce human reviewers' exposure to potentially harmful images and speed up the response to such content. This study presents methods for classifying sexually explicit content, which plays a crucial role in an automated CSAM detection system. Several approaches are explored: an end-to-end classifier, a classifier with person detection, and a private body parts detector. All proposed methods are tested on images obtained from an online tool for reporting illicit content. Due to legal constraints, access to the data is limited, and all algorithms are executed remotely on an isolated server. The end-to-end classifier yields the most promising results, with an accuracy of 90.17%, after augmenting the training set with additional neutral samples and adult pornography. While detection-based methods may not achieve higher accuracy and cannot serve as a final classifier on their own, their inclusion in the system can be beneficial. Human body-oriented approaches generate results that are easier to interpret, and interpretability is essential when analyzing models that are trained without direct access to data.

Paper Structure (9 sections, 1 equation, 3 figures, 6 tables)

This paper contains 9 sections, 1 equation, 3 figures, 6 tables.

Figures (3)

  • Figure 1: Decision schema that leads to the final classification in the proposed CSAM detection system. The complete system consists of two models: one performs age estimation and the other predicts whether the image contains sexually explicit elements (SE classification). Two-stage classification enables the system to distinguish between CSAM, adult pornography and neutral images.
  • Figure 2: Distinguishing between images that contain sexually explicit content (SE) and not-sexual ones (NS) is sufficient for the system to detect CSAM. However, we decided to divide the SE category into images showing sexual activity and images showing sexual posing. The not-sexual group can contain hard samples, such as images showing neutral nudity and soft erotica.
  • Figure 3: Images from the COCO dataset coco2015 incorrectly classified as SE by the first version of a model trained only on DN-A (SEN-EM). With only a small number of neutral samples in training, the model focuses on images of people eating or holding something near their mouths, or on close-up images of hands. These false positives were eliminated after adding a pretraining phase on extended datasets.
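
The two-stage decision described in the Figure 1 caption can be sketched as a small function that combines the outputs of the two models. This is only an illustrative sketch, not the authors' implementation: the function name `combine_decisions`, the score inputs, the 0.5 thresholds, and the order in which the two checks are applied are all assumptions, since the captions do not specify them.

```python
from enum import Enum


class Label(Enum):
    """Final categories named in the Figure 1 caption."""
    CSAM = "csam"
    ADULT_PORNOGRAPHY = "adult_pornography"
    NEUTRAL = "neutral"


def combine_decisions(
    se_score: float,
    minor_score: float,
    se_threshold: float = 0.5,     # hypothetical threshold, not from the paper
    minor_threshold: float = 0.5,  # hypothetical threshold, not from the paper
) -> Label:
    """Combine an SE-classifier score and an age-estimation score.

    se_score:    assumed probability that the image is sexually explicit.
    minor_score: assumed probability that the depicted person is a minor.
    """
    # Images that are not sexually explicit are neutral regardless of age.
    if se_score < se_threshold:
        return Label.NEUTRAL
    # Sexually explicit image: the age estimate decides the final category.
    if minor_score >= minor_threshold:
        return Label.CSAM
    return Label.ADULT_PORNOGRAPHY
```

Under these assumptions, an image is flagged only when both models agree, which matches the caption's point that the SE classifier alone separates neutral images from the rest while the age model separates the two explicit categories.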