SAIDO: Generalizable Detection of AI-Generated Images via Scene-Aware and Importance-Guided Dynamic Optimization in Continual Learning
Yongkang Hu, Yu Cheng, Yushuo Zhang, Yuan Xie, Zhaoxia Yin
TL;DR
SAIDO addresses the generalization problem in AI-generated image detection under continual learning with a Scene-Aware and Importance-Guided Dynamic Optimization framework. It combines a Scene-Awareness-Based Expert Module (SAEM), which uses vision-language large models (VLLMs) to dynamically allocate scene-specific LoRA detectors with a Scene-Aware Prompt system, and an Importance-Guided Dynamic Optimization Mechanism (IDOM), which performs neuron-level gradient projection guided by Fisher information to balance plasticity and stability. In continual learning experiments, SAIDO achieves relative reductions of 44.22% in average detection error rate and 40.57% in forgetting rate compared with the current state-of-the-art method; on open-world datasets it improves average detection accuracy by 9.47%, and it is robust to common image degradations. The work provides a scalable, data-replay-free solution for open-world AI-generated image detection that generalizes across diverse scenes and evolving generative models.
Abstract
The widespread misuse of image generation technologies has raised security concerns, driving the development of AI-generated image detection methods. However, generalization remains a key open challenge: existing approaches struggle to adapt to emerging generative methods and content types in real-world scenarios. To address this issue, we propose a Scene-Aware and Importance-Guided Dynamic Optimization detection framework with continual learning (SAIDO). Specifically, we design a Scene-Awareness-Based Expert Module (SAEM) that dynamically identifies and incorporates new scenes using vision-language large models (VLLMs). For each scene, an independent expert module is dynamically allocated, enabling the framework to better capture scene-specific forgery features and enhance cross-scene generalization. To mitigate catastrophic forgetting when learning from multiple image generative methods, we introduce an Importance-Guided Dynamic Optimization Mechanism (IDOM), which optimizes each neuron through an importance-guided gradient projection strategy, thereby achieving an effective balance between model plasticity and stability. Extensive experiments on continual learning tasks demonstrate that our method outperforms the current SOTA method in both stability and plasticity, achieving 44.22% and 40.57% relative reductions in average detection error rate and forgetting rate, respectively. On open-world datasets, it improves the average detection accuracy by 9.47% compared to the current SOTA method.
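
To make the idea of importance-guided, neuron-level gradient projection more concrete, the following is a minimal illustrative sketch, not the IDOM algorithm itself, whose details are not given in this abstract. It assumes that each neuron (one row of a weight matrix) has a single stored direction summarizing old-task gradients, that importance is a diagonal empirical Fisher score (mean squared gradient) min-max normalized to [0, 1], and that the component of the new gradient along the stored direction is removed in proportion to that importance. The function names `neuron_importance` and `project_gradient` are hypothetical.

```python
import math


def neuron_importance(per_sample_grads):
    """Per-neuron importance from a diagonal empirical Fisher estimate.

    per_sample_grads[s][i][j] is the gradient of the old-task loss on
    sample s with respect to weight j of neuron i (assumed layout).
    Returns one score per neuron, min-max normalized to [0, 1].
    """
    n_samples = len(per_sample_grads)
    n_neurons = len(per_sample_grads[0])
    scores = []
    for i in range(n_neurons):
        # Mean over samples of the squared gradients, summed over the neuron's weights.
        total = sum(g * g for sample in per_sample_grads for g in sample[i])
        scores.append(total / n_samples)
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All neurons equally important: this sketch applies no projection.
        return [0.0] * n_neurons
    return [(s - lo) / (hi - lo) for s in scores]


def project_gradient(grad, old_dirs, importance):
    """Remove, per neuron, the gradient component along its old-task direction.

    The removed fraction equals the neuron's importance: 1.0 gives a fully
    orthogonal update (stability), 0.0 keeps the raw gradient (plasticity).
    """
    projected = []
    for g, d, alpha in zip(grad, old_dirs, importance):
        norm = math.sqrt(sum(x * x for x in d))
        if norm == 0.0:
            # No stored direction for this neuron: leave its gradient unchanged.
            projected.append(list(g))
            continue
        unit = [x / norm for x in d]
        coeff = sum(a * b for a, b in zip(g, unit))
        projected.append([a - alpha * coeff * u for a, u in zip(g, unit)])
    return projected


# Toy example: two neurons with two weights each, two old-task samples.
old_grads = [
    [[1.0, 0.0], [0.1, 0.0]],
    [[1.0, 0.0], [0.1, 0.0]],
]
importance = neuron_importance(old_grads)
new_grad = [[1.0, 1.0], [1.0, 1.0]]
old_dirs = [[1.0, 0.0], [1.0, 0.0]]
update = project_gradient(new_grad, old_dirs, importance)
```

In this toy setting the first neuron has the largest old-task Fisher score, so its update loses the component along the stored direction, while the second neuron keeps its full gradient. A real implementation would operate on framework tensors and might store a subspace per neuron rather than a single direction.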
