Reinforced Multi-teacher Knowledge Distillation for Efficient General Image Forgery Detection and Localization

Zeqin Yu, Jiangqun Ni, Jian Zhang, Haoyi Deng, Yuzhen Lin

TL;DR

This work addresses the challenge of robust generalization in image forgery detection and localization (IFDL) across multiple tampering types. It introduces Re-MTKD, a multi-teacher knowledge distillation (KD) framework built on a Cue-Net backbone with an Edge-Aware Module, together with a Reinforced Dynamic Teacher Selection policy that adaptively weights specialized teachers during knowledge transfer. Empirical results across ten diverse datasets show state-of-the-art performance in both detection and localization, with particular strength on multi-tampering scenarios and favorable inference efficiency. The combination of type-specific teachers, dynamic selection, and edge-aware fusion offers a scalable and effective path toward practical, generalizable IFDL systems.

Abstract

Image forgery detection and localization (IFDL) is of vital importance, as forged images can spread misinformation that poses potential threats to our daily lives. However, previous methods still struggle to effectively handle forged images processed with diverse forgery operations in real-world scenarios. In this paper, we propose a novel Reinforced Multi-teacher Knowledge Distillation (Re-MTKD) framework for the IFDL task, structured around an encoder-decoder ConvNeXt-UperNet with an Edge-Aware Module, named Cue-Net. First, three Cue-Net models are separately trained for the three main types of image forgery, i.e., copy-move, splicing, and inpainting. These models then serve as the teachers for training the target student model, which also uses Cue-Net, through self-knowledge distillation. A Reinforced Dynamic Teacher Selection (Re-DTS) strategy is developed to dynamically assign weights to the involved teacher models, which facilitates specific knowledge transfer and enables the student model to effectively learn both the common and specific natures of diverse tampering traces. Extensive experiments demonstrate that, compared with other state-of-the-art methods, the proposed method achieves superior performance on several recently emerged datasets comprising various kinds of image forgeries.
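The weighted multi-teacher transfer described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact loss, so everything here is an assumption for illustration: the function names (`sigmoid`, `bce`, `multi_teacher_kd_loss`) are hypothetical, the per-pixel distillation term is taken to be binary cross-entropy against each teacher's soft mask, and the teacher weights are passed in by hand rather than produced by a learned Re-DTS policy.

```python
import math
from typing import Sequence


def sigmoid(x: float) -> float:
    """Map a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def bce(p_target: float, p_pred: float, eps: float = 1e-7) -> float:
    """Binary cross-entropy between a soft target and a predicted probability."""
    p_pred = min(max(p_pred, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(p_target * math.log(p_pred)
             + (1.0 - p_target) * math.log(1.0 - p_pred))


def multi_teacher_kd_loss(
    student_logits: Sequence[float],
    teacher_logits: Sequence[Sequence[float]],
    teacher_weights: Sequence[float],
) -> float:
    """Weighted sum of per-teacher distillation losses, each averaged over pixels.

    Assumption: one logit per pixel of a flattened forgery mask, and one
    weight per teacher (e.g. copy-move, splicing, inpainting).
    """
    if len(teacher_logits) != len(teacher_weights):
        raise ValueError("need exactly one weight per teacher")
    total = 0.0
    for weight, logits in zip(teacher_weights, teacher_logits):
        if weight == 0.0:
            continue  # a teacher with zero weight is effectively not selected
        per_pixel = [
            bce(sigmoid(t), sigmoid(s))
            for t, s in zip(logits, student_logits)
        ]
        total += weight * sum(per_pixel) / len(per_pixel)
    return total


if __name__ == "__main__":
    student = [2.0, -2.0, 0.5]
    teachers = [
        [2.0, -2.0, 0.5],    # hypothetical teacher that agrees with the student
        [-2.0, 2.0, -0.5],   # hypothetical teacher that disagrees
        [0.0, 0.0, 0.0],     # hypothetical uninformative teacher
    ]
    # Selecting only the first teacher yields a smaller loss than the second.
    print(multi_teacher_kd_loss(student, teachers, [1.0, 0.0, 0.0]))
    print(multi_teacher_kd_loss(student, teachers, [0.0, 1.0, 0.0]))
```

In this sketch a weight of zero drops a teacher entirely, which roughly mirrors selecting teachers per tampering type. How the weights are actually chosen is the role of the Re-DTS strategy and is not modeled here.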

Paper Structure

This paper contains 22 sections, 11 equations, 8 figures, 7 tables, 2 algorithms.

Figures (8)

  • Figure 1: Overview of the existing IFDL methods. Specific IFDL methods are often limited by inefficient models, leading to poor generalization across tampering operations (the first part). Generic IFDL methods exploit data inefficiently and struggle to learn both the common and specific tampered features from mixed tampering data (the second part). Our proposed method achieves promising performance on comprehensive IFDL problems.
  • Figure 2: We propose a novel Reinforced Multi-teacher Knowledge Distillation framework, structured by the simple yet effective Cue-Net backbone, for the IFDL task. Within this framework, the proposed Re-DTS strategy dynamically selects teacher models based on different tampering types of data, guiding the student model to effectively learn various tampering traces.
  • Figure 3: Feature space visualization of different KD strategies on CASIA v1+ and DiverseInp.
  • Figure 4: The four examples show copy-move (copying and moving an object within the target image), splicing (pasting an object from the source image into the target image), inpainting (erasing an object from the target image), and multi-tampering (possible combinations of the above three tampering operations). In column 3, the tampered regions of copy-move, splicing, and inpainting are outlined with blue, green, and red edges, respectively. Here, the multi-tampered image combines copy-move and inpainting, which presents a significant challenge to the IFDL task.
  • Figure 5: Robustness against JPEG compression, Gaussian blur, Gaussian noise and Median filtering effects.
  • ...and 3 more figures