ICM-Assistant: Instruction-tuning Multimodal Large Language Models for Rule-based Explainable Image Content Moderation

Mengyang Wu, Yuzhi Zhao, Jialun Cao, Mingjie Xu, Zhongming Jiang, Xuehui Wang, Qinbin Li, Guangneng Hu, Shengchao Qin, Chi-Wing Fu

TL;DR

This work addresses the need for flexible, explainable image content moderation that complies with diverse cultural and child-protection rules. It introduces a rule-based data generation pipeline to decompose moderation rules into attribute products, enrich explanations, and generate moderation Q-A, culminating in the ICM-Instruct dataset. By instruction-tuning open-source Multimodal LLMs with this data, the authors create ICM-Assistant models that outperform baselines in classification accuracy and explanation quality across diverse data sources, with notable zero-shot generalization to unseen terms. The framework enables precise, rule-aligned moderation with transparent reasoning and Q-A capabilities, offering practical impact for platforms needing configurable moderation and explanations.

Abstract

Controversial content is widespread on the Internet and infringes various cultural norms and child protection standards. Traditional Image Content Moderation (ICM) models fall short in producing precise moderation decisions for diverse standards, while recent multimodal large language models (MLLMs), when adapted to general rule-based ICM, often produce classification and explanation results that are inconsistent with human moderators. Aiming at flexible, explainable, and accurate ICM, we design a novel rule-based dataset generation pipeline that decomposes concise human-defined rules and leverages well-designed multi-stage prompts to enrich short explicit image annotations. Our ICM-Instruct dataset includes detailed moderation explanations and moderation Q-A pairs. Built upon it, we create our ICM-Assistant model in the framework of rule-based ICM, making it readily applicable in real practice. Our ICM-Assistant model demonstrates exceptional performance and flexibility. Specifically, it significantly outperforms existing approaches on various sources, consistently improving both moderation classification (36.8% on average) and moderation explanation quality (26.6% on average) over existing MLLMs. Code and data are available at https://github.com/zhaoyuzhi/ICM-Assistant.
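
The idea of decomposing a rule into attribute products can be pictured with a small sketch. This is only an illustration under stated assumptions, not the authors' pipeline: the attribute vocabularies, the function names, and the two example rules are all hypothetical, and the only label taken from the paper is "lower-leg" (Figure 3). It assumes that an attribute product is a combination of attribute values and that a rule flags some subset of those combinations.

```python
from itertools import product

# Hypothetical attribute vocabularies (assumption; not taken from the paper).
POSITIONS = ["upper", "lower"]
PARTS = ["arm", "leg"]


def attribute_products(positions, parts):
    """Enumerate every position-part combination as a label such as 'lower-leg'."""
    return [f"{pos}-{part}" for pos, part in product(positions, parts)]


def moderate(visible, flagged):
    """Return (decision, reasons).

    The image is 'unsafe' if any attribute product visible in it is flagged
    by the rule; the matching products double as a minimal explanation.
    """
    reasons = sorted(set(visible) & set(flagged))
    return ("unsafe" if reasons else "safe", reasons)


ALL_PRODUCTS = attribute_products(POSITIONS, PARTS)

# Two hypothetical rules that flag different subsets of the same products.
RULE_STRICT = {"upper-leg", "lower-leg"}
RULE_LENIENT = {"upper-leg"}

if __name__ == "__main__":
    print(ALL_PRODUCTS)
    print(moderate(["lower-leg"], RULE_STRICT))
    print(moderate(["lower-leg"], RULE_LENIENT))
```

In this toy setup the same image content receives different decisions under the two rules, which is the kind of configurability the abstract describes. The real system produces decisions and explanations with an instruction-tuned MLLM rather than a set lookup.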

Paper Structure

This paper contains 21 sections, 2 equations, 6 figures, 8 tables.

Figures (6)

  • Figure 1: Overall framework: Given an input image and rules (on cultural norms and child protection), our method can (a) flexibly align with human moderators under four rules and provide explainable results, overcoming (b) the classification and explanation inconsistency and achieving (c) more accurate moderation classification and explanations than the baseline MLLMs.
  • Figure 2: The overall pipeline from a specific set of moderation rules to a rule-based ICM-Assistant model. The first row illustrates rule decomposition, image downloading, and initialization of explicit descriptions. The second row presents the fully automatic data augmentation pipeline for the ICM-Instruct dataset and the instruction-tuning process for ICM-Assistant models.
  • Figure 3: Pipeline for generating moderation explanations and Q-A pairs from one attribute product, "lower-leg".
  • Figure 4: Illustration of the process for building the ICM-Test dataset, including ICM-Test-UGC (left) and ICM-Test-AIGC (right).
  • Figure 5: Comparison of ICM results with two different rules, R1 and R2. (See rule differences in Sec: ICM-Instruct dataset.)
  • ...and 1 more figure