Mining Generalized Features for Detecting AI-Manipulated Fake Faces

Yang Yu, Rongrong Ni, Yao Zhao

TL;DR

This work proposes a framework that mines intrinsic features and eliminates the distribution bias across manipulation techniques, improving generalization to fake faces produced by unseen techniques.

Abstract

AI-based face manipulation techniques have developed rapidly in recent years, raising new security issues in society. Although existing detection methods consider different categories of fake faces, their performance on fake faces produced by "unseen" manipulation techniques is still poor because of the distribution bias across manipulation techniques. To solve this problem, we propose a novel framework that focuses on mining intrinsic features and further eliminating the distribution bias to improve generalization ability. First, we focus on mining the intrinsic clues in the channel difference image (CDI) and the spectrum image (SI), which arise from the camera imaging process and from an indispensable step in the AI manipulation process. Then, we introduce Octave Convolution (OctConv) and an attention-based fusion module to effectively and adaptively mine intrinsic features from the CDI and SI. Finally, we design an alignment module that eliminates the bias of manipulation techniques to obtain a more generalized detection framework. We evaluate the proposed framework on fake-face datasets covering four categories of fake faces generated with the most popular and state-of-the-art manipulation techniques, and achieve very competitive performance. To further verify the generalization ability of the proposed framework, we conduct cross-manipulation experiments, and the results show the advantages of our method.
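To make the two inputs concrete, the following is a minimal sketch of how a channel difference image and a spectrum image could be computed from an RGB face crop. It is not the authors' implementation: the function names are hypothetical, the choice of the R-G channel pair follows the figure captions on this page, and the grayscale conversion and log-magnitude scaling used for the spectrum are assumptions.

```python
import numpy as np


def channel_difference_image(rgb):
    """Signed R - G difference of an H x W x 3 image (hypothetical helper).

    The R-G pair matches the CDIs shown in the figure captions; the paper
    may also use other channel pairs.
    """
    rgb = np.asarray(rgb, dtype=np.float64)
    return rgb[..., 0] - rgb[..., 1]


def spectrum_image(rgb):
    """Centered log-magnitude spectrum of the image (hypothetical helper).

    Assumptions: the image is reduced to grayscale by averaging channels,
    and log1p is used only to compress the dynamic range for display.
    """
    gray = np.asarray(rgb, dtype=np.float64).mean(axis=-1)
    spectrum = np.fft.fftshift(np.fft.fft2(gray))  # move the DC term to the center
    return np.log1p(np.abs(spectrum))


# Toy input: a flat 8 x 8 image with R=10, G=4, B=0.
toy = np.zeros((8, 8, 3))
toy[..., 0] = 10.0
toy[..., 1] = 4.0

cdi = channel_difference_image(toy)
si = spectrum_image(toy)
```

For a flat image the CDI is constant and all spectral energy sits in the centered DC term; the clues the abstract refers to would show up as structure away from these trivial patterns.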

Paper Structure

This paper contains 23 sections, 11 equations, 9 figures, 9 tables.

Figures (9)

  • Figure 1: Four categories of fake faces (bottom) with various AI-manipulated techniques and real faces (top). (a) Entire Synthesis. (b) Expression Manipulation. (c) Attribute Manipulation. (d) Identity Manipulation.
  • Figure 2: Previous detectors do not perform well on fake faces produced by unseen manipulation techniques.
  • Figure 3: Overview of the proposed framework for mining generalized features to detect AI-manipulated fake faces. The framework consists of three phases. First, the feature learning module accepts the CDI and SI as inputs and mines intrinsic features with OctResNet-34. Then, the feature attention fusion module adaptively fuses these two features. Finally, the domain alignment module reduces the bias of manipulation techniques by minimizing the difference in feature distribution across manipulation techniques, further improving the generalization ability of the framework when detecting fake faces produced by unseen manipulation techniques.
  • Figure 4: The real face images and the corresponding $R-G$ CDIs (left), and four categories of fake faces with different manipulation techniques and the corresponding $R-G$ CDIs (right). (a) Entire Synthesis. (b) Expression Manipulation. (c) Attribute Manipulation. (d) Identity Manipulation.
  • Figure 5: Average spectrum images (SI) of four categories of fake faces (bottom) with different manipulation techniques and corresponding real faces (top). (a) Entire Synthesis. (b) Expression Manipulation. (c) Attribute Manipulation. (d) Identity Manipulation.
  • ...and 4 more figures
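
The Figure 3 caption describes the domain alignment module as minimizing the difference in feature distribution across manipulation techniques, but this page does not say which discrepancy measure is used. As a rough illustration only, the sketch below uses a mean-feature discrepancy (equivalent to maximum mean discrepancy with a linear kernel), summed over all pairs of techniques. The measure, the function names, and the feature shapes are all assumptions, not the authors' method.

```python
from itertools import combinations

import numpy as np


def mean_feature_discrepancy(feats_a, feats_b):
    """Squared distance between the mean feature vectors of two batches.

    Assumed stand-in for the paper's distribution difference; each input
    is an array of shape (num_samples, feature_dim).
    """
    delta = np.mean(feats_a, axis=0) - np.mean(feats_b, axis=0)
    return float(delta @ delta)


def alignment_penalty(feature_groups):
    """Sum the discrepancy over every pair of manipulation techniques.

    feature_groups maps a technique name (hypothetical labels) to the
    features extracted from fake faces made with that technique.
    """
    names = sorted(feature_groups)
    return sum(
        mean_feature_discrepancy(feature_groups[a], feature_groups[b])
        for a, b in combinations(names, 2)
    )


# Toy features: technique_b is technique_a shifted by 1 in each of 3 dimensions.
base = np.arange(12, dtype=np.float64).reshape(4, 3)
groups = {"technique_a": base, "technique_b": base + 1.0}
penalty = alignment_penalty(groups)
```

In a training loop a penalty of this kind would typically be added to the classification loss, so that features from different techniques are pulled toward a shared distribution.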