AnimeDL-2M: Million-Scale AI-Generated Anime Image Detection and Localization in Diffusion Era
Chenyang Zhu, Xing Zhang, Yuyang Sun, Ching-Chun Chang, Isao Echizen
TL;DR
AnimeDL-2M tackles the lack of anime-focused IMDL benchmarks by introducing a million-scale dataset with real, edited, and AI-generated anime images and rich annotations. It proposes AniXplore, a domain-tailored IMDL model that fuses texture- and semantics-based features to improve localization and detection in anime imagery. Across extensive experiments, AniXplore outperforms six state-of-the-art methods and generalizes well in detection, with ablations highlighting the value of frequency features, dual-branch fusion, and adaptive loss balancing. Together, the dataset and model provide a practical resource for copyright protection and content moderation of AI-generated anime content, and they set the stage for future domain-specific research on AI forensics for stylized media.
Abstract
Recent advances in image generation, particularly diffusion models, have significantly lowered the barrier for creating sophisticated forgeries, making image manipulation detection and localization (IMDL) increasingly challenging. While prior work in IMDL has focused largely on natural images, the anime domain remains underexplored, despite its growing vulnerability to AI-generated forgeries. Misrepresentations of AI-generated images as hand-drawn artwork, copyright violations, and inappropriate content modifications pose serious threats to the anime community and industry. To address this gap, we propose AnimeDL-2M, the first large-scale benchmark for anime IMDL with comprehensive annotations. It comprises over two million images, including real, partially manipulated, and fully AI-generated samples. Experiments indicate that models trained on existing IMDL datasets of natural images perform poorly when applied to anime images, highlighting a clear domain gap between anime and natural images. To better handle IMDL tasks in the anime domain, we further propose AniXplore, a novel model tailored to the visual characteristics of anime imagery. Extensive evaluations demonstrate that AniXplore achieves superior performance compared to existing methods. The dataset and code are available at https://flytweety.github.io/AnimeDL2M/.
