Data-Independent Operator: A Training-Free Artifact Representation Extractor for Generalizable Deepfake Detection

Chuangchuang Tan, Ping Liu, RenShuai Tao, Huan Liu, Yao Zhao, Baoyuan Wu, Yunchao Wei

TL;DR

This work shows that a small, training-free filter is sufficient to capture more general artifact representations. The authors define this filter as the Data-Independent Operator (DIO) and report appealing improvements on unseen sources.

Abstract

Recently, the proliferation of increasingly realistic synthetic images generated by various generative adversarial networks has increased the risk of misuse. Consequently, there is a pressing need for a generalizable detector that accurately recognizes fake images. Conventional methods rely on generating diverse training sources or on large pretrained models. In this work, we show that, on the contrary, a small, training-free filter is sufficient to capture more general artifact representations. Because the filter is unbiased towards both the training and test sources, we define it as the Data-Independent Operator (DIO), and it achieves appealing improvements on unseen sources. In our framework, handcrafted filters and a randomly initialized convolutional layer can both serve as the training-free artifact representation extractor, with excellent results. Combining the data-independent operator with a popular classifier, such as ResNet50, already reaches a new state of the art without bells and whistles. We evaluate the effectiveness of the DIO on 33 generation models, including DALLE and Midjourney. Our detector achieves a remarkable improvement of $13.3\%$, establishing a new state-of-the-art performance. The DIO and its extension can serve as strong baselines for future methods. The code is available at https://github.com/chuangchuangtan/Data-Independent-Operator.

Paper Structure (27 sections, 4 equations, 3 figures, 11 tables)

Figures (3)

  • Figure 1: To enhance the generalization ability of detectors, we employ data-independent operators (DIO), fixed filters that treat known and unseen data without bias, to extract more general artifact representations. Our DIO effectively suppresses image content, compelling the detector to focus on artifact clues when detecting forged images. With these filters, a detector trained on a single source shows enhanced performance when tested on 33 previously unseen sources.
  • Figure 2: The pipeline of the simple and effective DIO and MDIO. To enhance the generalization ability of detectors, we use data-independent operators, fixed filters that treat known and unseen data without bias, to extract more general artifact representations. Combining the data-independent operator with a classifier already reaches a new SOTA without bells and whistles.
  • Figure 3: The t-SNE [van2014accelerating] visualization of features extracted from the classifier. The blue and light blue points represent the features of real and fake images, respectively. In the visualization, the fake images are generated from 15 sources. The classifier trained on ProGAN is able to detect cross-source images, even those from diffusion models. This observation suggests that our proposed DIO successfully reduces the distribution drift between different sources, thereby enhancing the generalization ability of the detectors.
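
The idea described in the captions above, a fixed filter that suppresses image content before the classifier sees it, can be illustrated with a small dependency-free sketch. Everything here is an assumption made for illustration: the Laplacian kernel, the helper names `random_kernel` and `apply_operator`, and the toy 5x5 image are hypothetical and are not taken from the authors' code. A real pipeline would apply such an operator to image tensors in a deep learning framework and pass the result to a classifier such as ResNet50, training the classifier while the operator stays fixed.

```python
import random

# Hypothetical 3x3 Laplacian high-pass kernel, chosen for illustration only;
# the source does not specify which handcrafted filters are used.
LAPLACIAN = [
    [0.0, -1.0, 0.0],
    [-1.0, 4.0, -1.0],
    [0.0, -1.0, 0.0],
]


def random_kernel(size=3, seed=0):
    """Return a randomly initialized kernel that is fixed by its seed and never trained."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(size)] for _ in range(size)]


def apply_operator(image, kernel):
    """Valid-mode 2D cross-correlation of a grayscale image with a fixed kernel."""
    k = len(kernel)
    height, width = len(image), len(image[0])
    out = []
    for i in range(height - k + 1):
        row = []
        for j in range(width - k + 1):
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    acc += kernel[di][dj] * image[i + di][j + dj]
            row.append(acc)
        out.append(row)
    return out


# A flat region (smooth content with no local variation) is removed by the high-pass filter.
flat = [[5.0] * 5 for _ in range(5)]
flat_response = apply_operator(flat, LAPLACIAN)

# A single-pixel perturbation stands in for a local artifact and survives the filter.
spiky = [row[:] for row in flat]
spiky[2][2] += 1.0
spiky_response = apply_operator(spiky, LAPLACIAN)
```

In this toy setting the flat image maps to an all-zero response, while the perturbed pixel produces a clear peak, which is the sense in which a fixed high-pass operator discards content and keeps local irregularities. Swapping `LAPLACIAN` for `random_kernel()` gives the randomly initialized variant; because the kernel depends only on the seed, it is independent of any training or test data.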