WMamba: Wavelet-based Mamba for Face Forgery Detection
Siran Peng, Tianshuo Zhang, Li Gao, Xiangyu Zhu, Haoyuan Zhang, Kai Pang, Zhen Lei
TL;DR
WMamba is a wavelet-based face forgery detector built on the Mamba architecture. It pairs Dynamic Contour Convolution (DCConv), which models slender facial contours, with a VMamba backbone that captures long-range spatial dependencies. The Hierarchical Wavelet Feature Extraction Branch (HWFEB) computes multi-level Haar DWT representations and DCConv-guided spatial attention, which are injected into VMamba through spatial gating. Cross-dataset and cross-manipulation experiments show state-of-the-art generalization and robustness, and ablations confirm the contributions of HWFEB, DCConv, and VMamba. Because the model extracts forgery artifacts efficiently from small image patches, it is relevant to real-world anti-fraud and misinformation mitigation.
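To make the "multi-level Haar DWT representations" concrete, here is a minimal NumPy sketch of a 2D Haar pyramid. It is an illustration only, not the HWFEB implementation: the function names `haar_dwt2` and `haar_pyramid`, the orthonormal 1/2 scaling, and the sub-band names are assumptions made for this example, and it expects a single-channel image whose sides stay even at every level.

```python
import numpy as np


def haar_dwt2(x):
    """Single-level 2D Haar DWT (hypothetical helper, orthonormal scaling)."""
    x = np.asarray(x, dtype=np.float64)
    if x.ndim != 2 or x.shape[0] % 2 or x.shape[1] % 2:
        raise ValueError("expected a 2D array with even height and width")
    # The four pixels of every non-overlapping 2x2 block.
    a = x[0::2, 0::2]  # top-left
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    approx = (a + b + c + d) / 2.0  # low-pass in both directions
    horiz = (a + b - c - d) / 2.0   # top minus bottom: horizontal edges
    vert = (a - b + c - d) / 2.0    # left minus right: vertical edges
    diag = (a - b - c + d) / 2.0    # diagonal detail
    return approx, (horiz, vert, diag)


def haar_pyramid(x, levels):
    """Apply haar_dwt2 repeatedly to the approximation band."""
    approx = np.asarray(x, dtype=np.float64)
    details = []
    for _ in range(levels):
        approx, bands = haar_dwt2(approx)
        details.append(bands)  # details[0] is the finest level
    return approx, details


if __name__ == "__main__":
    patch = np.random.default_rng(0).random((64, 64))
    approx, details = haar_pyramid(patch, levels=3)
    print(approx.shape, [bands[0].shape for bands in details])
```

Each level halves the spatial resolution, so the detail bands form a coarse-to-fine hierarchy of edge responses. Contour-like structures show up as thin, sparse ridges in these bands, which is the kind of signal the TL;DR says DCConv is designed to follow.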
Abstract
The rapid evolution of deepfake generation technologies necessitates the development of robust face forgery detection algorithms. Recent studies have demonstrated that wavelet analysis can enhance the generalization abilities of forgery detectors. Wavelets effectively capture key facial contours, often slender, fine-grained, and globally distributed, that may conceal subtle forgery artifacts imperceptible in the spatial domain. However, current wavelet-based approaches fail to fully exploit the distinctive properties of wavelet data, resulting in sub-optimal feature extraction and limited performance gains. To address this challenge, we introduce WMamba, a novel wavelet-based feature extractor built upon the Mamba architecture. WMamba maximizes the utility of wavelet information through two key innovations. First, we propose Dynamic Contour Convolution (DCConv), which employs specially crafted deformable kernels to adaptively model slender facial contours. Second, by leveraging the Mamba architecture, our method captures long-range spatial relationships with linear complexity. This efficiency allows for the extraction of fine-grained, globally distributed forgery artifacts from small image patches. Extensive experiments show that WMamba achieves state-of-the-art (SOTA) performance, highlighting its effectiveness in face forgery detection.
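The linear-complexity claim can be illustrated with a toy state-space recurrence. The sketch below is a deliberate simplification and not the Mamba or VMamba scan: it uses a scalar state with fixed parameters `a`, `b`, `c` (all hypothetical), whereas Mamba makes its parameters input-dependent and VMamba scans 2D feature maps along several directions. It only shows why one pass over the sequence is enough to carry information across long distances.

```python
import numpy as np


def ssm_scan(x, a, b, c):
    """Toy linear recurrence: h_t = a * h_{t-1} + b * x_t, y_t = c * h_t.

    One state update per element, so the cost grows linearly with len(x).
    """
    h = 0.0
    y = np.empty(len(x), dtype=np.float64)
    for t, x_t in enumerate(x):
        h = a * h + b * x_t  # the state summarizes everything seen so far
        y[t] = c * h
    return y


if __name__ == "__main__":
    # Flatten a small patch row by row into a 1D sequence (an assumption;
    # real scanning strategies differ) and run a single linear-time pass.
    patch = np.random.default_rng(0).random((8, 8))
    out = ssm_scan(patch.reshape(-1), a=0.9, b=0.1, c=1.0)
    print(out.shape)
```

In this recurrence an input at position t still contributes `c * a**k * b` to the output k steps later, so distant positions interact without the pairwise comparisons that make self-attention quadratic in sequence length.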
