Detecting Malicious Concepts Without Image Generation in AIGC
Kun Xu, Yushu Zhang, Shuren Qi, Tao Wang, Wenying Wen, Yuming Fang
TL;DR
This work defines malicious concepts in the context of AIGC concept sharing and proposes Concept QuickLook, a detection framework that operates solely on concept files to flag potential harm without generating any images. It introduces two detection modes, concept matching and fuzzy detection, to handle both known concepts and concepts whose class membership is unknown, leveraging embedding-space mappings between concept vectors and their classes. In extensive experiments on SD1.5/SD2.0 pipelines, the approach shows high accuracy, favorable user-focused scoring, and robustness to the number of embedding vectors and to differences between SD versions, all while avoiding the computational cost of image generation. The framework aims to protect platforms and users from malicious or mismatched concepts and offers a practical, scalable direction for proactive security in open concept-sharing ecosystems.
Abstract
Text-to-image generation has achieved tremendous practical success, and emerging concept generation models can produce highly personalized and customized content. User enthusiasm for concept generation is growing rapidly, and concept-sharing platforms have sprung up. Concept owners may upload malicious concepts and disguise them with non-malicious text descriptions and example images, deceiving users into downloading them and generating malicious content. Platforms therefore need a quick way to determine whether a concept is malicious in order to stop such concepts from spreading. However, judging a concept by generating images from it costs time and computational resources. As the number of concepts uploaded and downloaded continues to grow, this approach becomes impractical, and it also risks generating malicious content. In this paper, we propose Concept QuickLook, the first systematic study of malicious concept detection, which performs detection based solely on concept files without generating any images. We define malicious concepts and design two working modes for detection: concept matching and fuzzy detection. Extensive experiments demonstrate that Concept QuickLook can detect malicious concepts and is practical for concept-sharing platforms. We also design robustness experiments to further validate the effectiveness of the solution. We hope this work initiates the task of malicious concept detection and provides some inspiration.
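To make the idea of detection from concept files alone more concrete, the following is a minimal sketch of what a concept-matching check could look like. It is not the paper's implementation, which this abstract does not specify. It assumes a concept file can be reduced to a small list of embedding vectors and that a platform keeps reference vectors for concepts already flagged as malicious. The function names, the averaging step, the cosine-similarity measure, the 0.9 threshold, and the toy 3-dimensional vectors are all hypothetical choices made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mean_vector(vectors):
    """Average a concept's embedding vectors into one vector (an assumed pooling step)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def match_concept(concept_vectors, known_malicious, threshold=0.9):
    """Compare an uploaded concept against flagged references.

    Returns (name, score) for the closest reference if its similarity is at or
    above the threshold, otherwise (None, score). No image is generated.
    """
    query = mean_vector(concept_vectors)
    best_name, best_score = None, -1.0
    for name, reference_vectors in known_malicious.items():
        score = cosine(query, mean_vector(reference_vectors))
        if score > best_score:
            best_name, best_score = name, score
    if best_score >= threshold:
        return best_name, best_score
    return None, best_score


# Toy data: hypothetical 3-dimensional embeddings, far smaller than real ones.
known = {"flagged_concept_a": [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]]}
uploaded_similar = [[0.95, 0.05, 0.0]]
uploaded_unrelated = [[0.0, 0.0, 1.0]]

print(match_concept(uploaded_similar, known))
print(match_concept(uploaded_unrelated, known))
```

In this sketch the first upload is matched to the flagged reference and the second is not. A fuzzy-detection mode would instead have to decide class membership for concepts with no exact reference, which this example does not attempt.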
