MSN: Multi-directional Similarity Network for Hand-crafted and Deep-synthesized Copy-Move Forgery Detection
Liangwei Jiang, Jinluo Xie, Yecheng Huang, Hua Zhang, Hongyu Yang, Di Huang
TL;DR
This paper tackles the growing challenge of copy-move forgery detection under rotations, scaling, and deep-synthesized tampering. It introduces the Multi-directional Similarity Network (MSN), a two-stream framework that combines a multi-directional, multi-scale representation with a 2-D similarity matrix decoder to improve region localization. MSN achieves state-of-the-art results on the classic benchmarks CASIA CMFD and CoMoFoD and demonstrates strong robustness on the newly proposed deep-synthesized forgery dataset (DCF), including improvements from synthetic-data fine-tuning. The work also provides extensive ablation analyses and reports fast inference, underscoring both effectiveness and practicality for real-world CMFD tasks in the era of deepfake-like manipulations.
Abstract
Copy-move image forgery aims to duplicate certain objects or to hide specific contents with copy-move operations, which can be achieved by a sequence of manual manipulations as well as by up-to-date deep generative network-based swapping. Its detection is becoming increasingly challenging owing to the complex transformations and fine-tuned operations applied to the tampered regions. In this paper, we propose a novel two-stream model, namely Multi-directional Similarity Network (MSN), for accurate and efficient copy-move forgery detection. It addresses two major limitations of existing deep detection models, in \textbf{representation} and \textbf{localization}, respectively. In representation, an image is hierarchically encoded by a multi-directional CNN, and owing to the diverse augmentation in scales and rotations, the resulting features better measure the similarity between sampled patches in the two streams. In localization, we design a decoder based on a 2-D similarity matrix; compared with the current decoder based on a 1-D similarity vector, it makes full use of the spatial information in the entire image, which improves the detection of tampered regions. Beyond the method, we present a new forgery database generated by various deep neural networks as a benchmark for detecting the growing class of deep-synthesized copy-move forgeries. Extensive experiments are conducted on two classic image forensics benchmarks, \emph{i.e.}, CASIA CMFD and CoMoFoD, and on the newly presented one. State-of-the-art results are reported, demonstrating the effectiveness of the proposed approach.
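To make the contrast between a 2-D similarity matrix and a 1-D similarity vector concrete, the following is a minimal sketch and not the authors' implementation. It assumes cosine similarity between patch features, a toy 4x4 grid of random features, and hypothetical helper names (`similarity_matrix`, `top_k_vector`); none of these details are specified in the abstract.

```python
import numpy as np


def similarity_matrix(feats_a, feats_b, eps=1e-8):
    """Cosine similarity between every patch of stream A and every patch of stream B.

    feats_a: (N, C) patch features, feats_b: (M, C) patch features.
    Returns an (N, M) matrix. Cosine similarity is an assumption of this sketch.
    """
    a = feats_a / (np.linalg.norm(feats_a, axis=1, keepdims=True) + eps)
    b = feats_b / (np.linalg.norm(feats_b, axis=1, keepdims=True) + eps)
    return a @ b.T


def top_k_vector(sim, k):
    """1-D style summary: keep the k highest scores per patch.

    Sorting discards the positions at which those scores occurred,
    which is the spatial information a 2-D matrix retains.
    """
    return -np.sort(-sim, axis=1)[:, :k]


# Toy example: a 4x4 grid of patches with 8-dimensional random features.
rng = np.random.default_rng(0)
h, w, c = 4, 4, 8
feats = rng.standard_normal((h * w, c))
feats[13] = feats[2]  # simulate a copy-move: patch 13 duplicates patch 2

sim = similarity_matrix(feats, feats)  # (16, 16), one row per patch
np.fill_diagonal(sim, -1.0)            # ignore trivial self-matches

best = sim.argmax(axis=1)              # location of the best match per patch
summary = top_k_vector(sim, k=3)       # (16, 3), scores only, no locations

print(best[2], best[13])
```

In this toy setting the full matrix reveals both that patches 2 and 13 match and where the match lies, whereas the sorted top-k summary keeps only the score values. A decoder operating on the full matrix can therefore exploit where in the image the similar patches sit.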
