DeMod: A Holistic Tool with Explainable Detection and Personalized Modification for Toxicity Censorship

Yaqiong Li; Peng Zhang; Hansu Gu; Tun Lu; Siyuan Qiao; Yubo Shu; Yiyang Shao; Ning Gu

TL;DR

This work treats toxicity censorship as a multi-stage process that extends beyond detection and proposes DeMod, a ChatGPT-based holistic tool that provides explainable detection and personalized content modification. Guided by a needfinding study on Weibo, DeMod integrates user authorization, fine-grained detection with immediate and dynamic explanations, and user-tailored revisions; it was deployed and evaluated with 35 Weibo users. Automatic and human evaluations show detection accuracy of up to 73.5% with GPT-4 and a toxicity reduction of about 94%, along with high user acceptance. The study offers design insights for end-to-end censorship systems that emphasize interpretability, user control, and multi-stage functionality, with implications for platforms, policymakers, and future research in multimedia and cross-platform contexts.

Abstract

Although automated approaches and tools exist to support toxicity censorship for social posts, most of them focus on detection. Toxicity censorship is a complex process in which detection is only the initial task; users can have further needs, such as understanding the rationale and modifying the content. To address this gap, we conducted a needfinding study to investigate people's diverse needs in toxicity censorship and then built a ChatGPT-based censorship tool named DeMod accordingly. DeMod is equipped with explainable Detection and personalized Modification, providing fine-grained detection results, detailed explanations, and personalized modification suggestions. We implemented the tool and recruited 35 Weibo users for evaluation. The results point to several strengths of DeMod, including rich functionality, accurate censorship, and ease of use. Based on the findings, we further propose several insights into the design of content censorship systems.
