Beyond Spectral Peaks: Interpreting the Cues Behind Synthetic Image Detection
Sara Mandelli, Diego Vila-Portela, David Vázquez-Padín, Paolo Bestagini, Fernando Pérez-González
TL;DR
This paper questions the assumption that frequency-domain spectral peaks are central cues used by deep learning detectors for synthetic image detection. It introduces peak-removal experiments in the Fourier domain and a simple linear, peak-based baseline to explicitly test detector reliance on these artifacts. The results show that most state-of-the-art detectors are largely not dependent on spectral peaks, while a straightforward peak-based detector achieves high accuracy, highlighting the value of interpretable methods. The findings motivate hybrid approaches that combine the transparency of linear methods with the power of deep learning for more trustworthy forensic tools.
Abstract
Over the years, the forensics community has proposed several deep learning-based detectors to mitigate the risks of generative AI. Recently, frequency-domain artifacts (particularly periodic peaks in the magnitude spectrum) have received significant attention, as they have often been considered a strong indicator of synthetic image generation. However, state-of-the-art detectors are typically used as black boxes, and it remains unclear whether they truly rely on these peaks. This limits their interpretability and the trust that can be placed in them. In this work, we conduct a systematic study to address this question. We propose a strategy to remove spectral peaks from images and analyze the impact of this operation on several detectors. In addition, we introduce a simple linear detector that relies exclusively on frequency peaks, providing a fully interpretable baseline free from the confounding influence of deep learning. Our findings reveal that most detectors are not fundamentally dependent on spectral peaks, challenging a widespread assumption in the field and paving the way for more transparent and reliable forensic tools.
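To make the idea of peak removal concrete, the following is a minimal sketch of one way spectral peaks could be suppressed in a grayscale image. It is not the strategy proposed in the paper, whose details are not given in the abstract. The local-median criterion, the function names `local_median` and `suppress_spectral_peaks`, and the parameters `k` and `ratio` are all assumptions made for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def local_median(mag, k=5):
    """Median of each k x k neighbourhood (k odd), using wrap-around
    padding because the 2D DFT is periodic."""
    pad = k // 2
    padded = np.pad(mag, pad, mode="wrap")
    windows = sliding_window_view(padded, (k, k))
    return np.median(windows, axis=(-2, -1))


def suppress_spectral_peaks(img, k=5, ratio=4.0):
    """Hypothetical peak removal: replace magnitude-spectrum bins that
    exceed `ratio` times their local median with that median, keeping
    the phase unchanged. Returns the filtered image and the peak mask."""
    spec = np.fft.fft2(img)
    mag, phase = np.abs(spec), np.angle(spec)
    med = local_median(mag, k)
    peaks = mag > ratio * med
    peaks[0, 0] = False  # leave the DC term (mean intensity) untouched
    new_mag = np.where(peaks, med, mag)
    # The modified spectrum stays conjugate-symmetric, so the imaginary
    # part of the inverse transform is numerical noise and is discarded.
    out = np.fft.ifft2(new_mag * np.exp(1j * phase)).real
    return out, peaks


# Toy input: Gaussian noise plus a horizontal cosine pattern that creates
# two symmetric peaks in the magnitude spectrum.
rng = np.random.default_rng(0)
x = np.arange(64)
img = rng.normal(size=(64, 64)) + 2.0 * np.cos(2 * np.pi * 8 * x / 64)[None, :]
filtered, mask = suppress_spectral_peaks(img)
```

Under these assumptions, a detector's reliance on peaks could be probed by comparing its scores on `img` and `filtered`: if the scores barely change, the detector is presumably using other cues.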
