ForensicsForest Family: A Series of Multi-scale Hierarchical Cascade Forests for Detecting GAN-generated Faces
Jiucui Lu, Jiaran Zhou, Junyu Dong, Bin Li, Siwei Lyu, Yuezun Li
TL;DR
This work tackles the rising realism of GAN-generated faces and the limitations of CNN-based detectors, notably high compute demands and vulnerability to adversarial manipulation. It introduces ForensicsForest Family, a set of forest-based detectors built on a Multi-scale Hierarchical Cascade Forest that processes semantic, frequency, and biological facial cues through a patch-wise, multi-scale pipeline. The three variants (ForensicsForest, Hybrid ForensicsForest, and Divide-and-Conquer ForensicsForest) range from a pure forest architecture to a hybrid with learnable refinements and a memory-efficient training scheme. Across StyleGAN, StyleGAN2, StyleGAN3, and other generators, the method achieves strong detection performance, trains on CPUs, and is robust to common perturbations, which suggests it is a practical alternative to CNN-based approaches for GAN forensics.
Abstract
Rapid progress in generative models has significantly improved the realism of generated faces, raising serious concerns for society. Because recent GAN-generated faces are highly realistic, forgery traces have become harder to perceive, which makes forensics more challenging. To combat GAN-generated faces, many countermeasures based on Convolutional Neural Networks (CNNs) have been proposed, owing to their strong learning ability. In this paper, we rethink this problem and explore a new approach based on forest models instead of CNNs. Specifically, we describe a simple and effective set of forest-based methods, called ForensicsForest Family, to detect GAN-generated faces. The ForensicsForest Family comprises three variants: ForensicsForest, Hybrid ForensicsForest, and Divide-and-Conquer ForensicsForest. ForensicsForest is a newly proposed Multi-scale Hierarchical Cascade Forest that takes semantic, frequency, and biology features as input, hierarchically cascades different levels of features for authenticity prediction, and then employs a multi-scale ensemble scheme that considers different levels of information to further improve performance. Based on ForensicsForest, we develop Hybrid ForensicsForest, an extended version that integrates CNN layers into the model to further refine the augmented features. Moreover, to reduce the memory cost of training, we propose Divide-and-Conquer ForensicsForest, which can construct a forest model using only a portion of the training samples. In the training stage, we train several candidate forest models on subsets of the training samples. A ForensicsForest is then assembled by picking suitable components from these candidate forest models...
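The cascading idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a generic cascade-forest layout in which each layer's forests emit a probability that is appended to the original input as augmented features for the next layer. Bagged decision stumps stand in for full forests, and the names `train_stump`, `train_forest`, `forest_proba`, `train_cascade`, and `predict_cascade` are hypothetical, as is the label convention (1 for GAN-generated, 0 for real). The multi-scale ensemble and the semantic, frequency, and biology feature extractors are not modeled.

```python
import random

def train_stump(X, y, rng):
    """Fit one decision stump on a bootstrap sample, using a random feature and threshold."""
    n = len(X)
    idx = [rng.randrange(n) for _ in range(n)]      # bootstrap sample of row indices
    f = rng.randrange(len(X[0]))                    # random feature index
    thr = X[rng.choice(idx)][f]                     # threshold drawn from the sampled data
    left = [y[i] for i in idx if X[i][f] <= thr]
    right = [y[i] for i in idx if X[i][f] > thr]
    # Leaf value = fraction of label-1 samples; 0.5 if the leaf is empty.
    p_left = sum(left) / len(left) if left else 0.5
    p_right = sum(right) / len(right) if right else 0.5
    return f, thr, p_left, p_right

def train_forest(X, y, n_trees, rng):
    """A 'forest' here is a bag of stumps (a stand-in for a real random forest)."""
    return [train_stump(X, y, rng) for _ in range(n_trees)]

def forest_proba(forest, x):
    """Average the leaf probabilities of label 1 over all stumps."""
    return sum(pl if x[f] <= thr else pr for f, thr, pl, pr in forest) / len(forest)

def train_cascade(X, y, n_layers=2, forests_per_layer=2, n_trees=50, seed=0):
    """Train layers in sequence; each layer sees the input plus the previous layer's outputs."""
    rng = random.Random(seed)
    layers = []
    feats = [list(x) for x in X]
    for _ in range(n_layers):
        layer = [train_forest(feats, y, n_trees, rng) for _ in range(forests_per_layer)]
        layers.append(layer)
        # Augmented features: original input concatenated with each forest's probability.
        feats = [list(x) + [forest_proba(fr, ft) for fr in layer]
                 for x, ft in zip(X, feats)]
    return layers

def predict_cascade(layers, x):
    """Pass one sample through all layers and average the last layer's probabilities."""
    feat = list(x)
    probs = []
    for layer in layers:
        probs = [forest_proba(fr, feat) for fr in layer]
        feat = list(x) + probs
    return sum(probs) / len(probs)
```

Calling `train_cascade(X, y)` on a list of feature vectors and 0/1 labels returns the layer list, and `predict_cascade(layers, x)` returns a score in [0, 1]. For brevity the sketch computes the augmented features on the training data directly, without held-out or cross-validated predictions.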
