Exploring Depth Information for Detecting Manipulated Face Videos

Haoyue Wang; Sheng Li; Ji He; Zhenxing Qian; Xinpeng Zhang; Shaolin Fan

TL;DR

This paper proposes a Face Depth Map Transformer (FDMT) that estimates the face depth map patch by patch from an RGB face image, capturing the local depth anomalies created by manipulation, and an RGB-Depth Inconsistency Attention (RDIA) module that captures inter-frame inconsistency in multi-frame input.

Abstract

Face manipulation detection has received considerable attention because of its importance to the reliability and security of face images and videos. Recent studies focus on using auxiliary information or prior knowledge to capture robust manipulation traces, an approach that has shown promise. The face depth map is an important face feature that has been shown to be effective in other areas such as face recognition and face detection, yet it has received little attention in the literature on face manipulation detection. In this paper, we explore the possibility of incorporating the face depth map as auxiliary information for robust face manipulation detection. To this end, we first propose a Face Depth Map Transformer (FDMT) that estimates the face depth map patch by patch from an RGB face image and can therefore capture the local depth anomalies created by manipulation. The estimated face depth map is then treated as auxiliary information and integrated with the backbone features using a newly designed Multi-head Depth Attention (MDA) mechanism. We also propose an RGB-Depth Inconsistency Attention (RDIA) module to capture the inter-frame inconsistency in multi-frame input. Experiments demonstrate the advantage of the proposed method for face manipulation detection.
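
To make the idea of integrating depth with backbone features through attention more concrete, the following is a minimal single-head sketch in plain Python. It is not the MDA mechanism itself, whose formulation the abstract does not give. It assumes, purely for illustration, that depth-derived tokens act as queries and RGB backbone tokens act as keys and values in standard scaled dot-product attention. The names `depth_guided_attention`, `depth_tokens`, and `rgb_tokens` are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def depth_guided_attention(depth_tokens, rgb_tokens):
    """Scaled dot-product attention (single head, no learned projections).

    Assumption for illustration only: each depth token is a query, and the
    RGB backbone tokens serve as both keys and values. The output for each
    depth token is a convex combination of the RGB tokens.
    """
    d = len(rgb_tokens[0])
    fused = []
    for q in depth_tokens:
        # Similarity between this depth query and every RGB key.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
            for k in rgb_tokens
        ]
        weights = softmax(scores)
        # Weighted sum of the RGB values, one output dimension at a time.
        fused.append(
            [sum(w * v[j] for w, v in zip(weights, rgb_tokens)) for j in range(d)]
        )
    return fused


# Toy inputs: three RGB patch tokens and two depth patch tokens, dimension 2.
rgb_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
depth_tokens = [[2.0, 0.0], [0.0, 2.0]]
fused = depth_guided_attention(depth_tokens, rgb_tokens)
```

In this toy setup, the first depth token aligns with the first feature dimension, so its fused output weights that dimension more heavily. A multi-head variant would repeat the same computation over several learned projections and concatenate the results.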
