A Framework for Portrait Stylization with Skin-Tone Awareness and Nudity Identification
Seungkwon Kim, Sangyeon Kim, Seung-Hun Nam
TL;DR
This work tackles practical portrait stylization by addressing skin-tone fidelity and explicit-content filtering in production settings. It introduces STAPSM, a skin-tone-aware portrait stylization module that uses LoRA with skin-tone spectrum augmentation and progressive edge-then-depth ControlNet inference, and NCIM, a nudity content identification module that combines CLIP-based filtering with BLIP caption-based keyword matching. The authors report better skin-tone preservation and more reliable nudity detection, and the system has been deployed as TOON-FILTER, where it has processed over 2 million images with no reported incidents. Together, these contributions enable high-quality, safer portrait stylization suitable for enterprise Webtoon IP applications.
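The two-stage idea behind NCIM (an image-level CLIP score plus keyword matching on a BLIP caption) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the keyword list, the 0.5 threshold, the OR combination of the two stages, and the function names `caption_has_keyword` and `is_explicit` are all hypothetical. The CLIP scorer and BLIP captioner are passed in as plain callables so the sketch stays independent of any particular model API.

```python
import re
from typing import Callable, Iterable

# Hypothetical values for illustration; the paper's keywords and threshold are not given here.
NUDITY_KEYWORDS = ("nude", "naked", "topless")
CLIP_THRESHOLD = 0.5


def caption_has_keyword(caption: str, keywords: Iterable[str] = NUDITY_KEYWORDS) -> bool:
    """Return True if any keyword appears as a whole word in the caption (case-insensitive)."""
    words = set(re.findall(r"[a-z]+", caption.lower()))
    return any(keyword in words for keyword in keywords)


def is_explicit(
    image: object,
    clip_score: Callable[[object], float],
    caption: Callable[[object], str],
    threshold: float = CLIP_THRESHOLD,
) -> bool:
    """Flag an image if either stage fires (assumed OR combination).

    clip_score: stand-in for a CLIP-based nudity score in [0, 1].
    caption:    stand-in for a BLIP captioning call that returns text.
    """
    if clip_score(image) >= threshold:
        return True
    return caption_has_keyword(caption(image))
```

In this sketch the caption stage acts as a second opinion: an image that the score-based filter lets through is still rejected if its generated caption contains a listed keyword.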
Abstract
Portrait stylization is a challenging task that transforms an input portrait image into a specific style while preserving its inherent characteristics. The recent introduction of Stable Diffusion (SD) has significantly improved output quality in this field. However, there is still no practical stylization framework that effectively filters harmful input content and preserves distinctive characteristics of the input, such as skin tone, while maintaining stylization quality. This gap has hindered the wide deployment of such frameworks. To address these issues, this study proposes a portrait stylization framework that incorporates a nudity content identification module (NCIM) and a skin-tone-aware portrait stylization module (STAPSM). In experiments, NCIM improved explicit-content filtering, and STAPSM accurately represented a diverse range of skin tones. The proposed framework has been successfully deployed in practice and has satisfied the critical requirements of real-world applications.
