Generalized Source Tracing: Detecting Novel Audio Deepfake Algorithm with Real Emphasis and Fake Dispersion Strategy
Yuankun Xie, Ruibo Fu, Zhengqi Wen, Zhiyong Wang, Xiaopeng Wang, Haonnan Cheng, Long Ye, Jianhua Tao
TL;DR
This paper tackles the challenge of attributing audio deepfakes to their source algorithms and identifying novel, out-of-distribution (OOD) deepfake algorithms. It introduces a dual-stage framework, Real Emphasis and Fake Dispersion (REFD), combined with a Novel Similarity Detection (NSD) OOD detector. Real Emphasis uses OC-Softmax to establish a tight boundary around real speech, while Fake Dispersion employs RegMixup to produce softer logits and improve OOD sensitivity; NSD uses the similarity of both features and logits to detect unknown methods. Evaluated on ADD2023T3 (Audio Deepfake Detection Challenge 2023, Track 3), REFD achieves a single-system F1 score of 86.83% and shows strong OOD detection across OOD detectors, indicating robust audio deepfake algorithm recognition (ADAR) with effective separation of in-distribution (ID) and unknown deepfake classes. The work advances practical audio forensics by providing a scalable, dual-stage approach with explicit OOD handling and a specialized OOD detector.
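To make the "tight real boundary" idea concrete, the following is a minimal sketch of the OC-Softmax loss in plain Python. It is not the authors' implementation: the function name `oc_softmax_loss`, the label convention (0 = real, 1 = fake), and the default margins and scale (`m_real=0.9`, `m_fake=0.2`, `alpha=20.0`) are assumptions made for illustration and are not taken from this paper.

```python
import math


def oc_softmax_loss(embeddings, labels, center,
                    m_real=0.9, m_fake=0.2, alpha=20.0):
    """Mean OC-Softmax loss over a batch.

    embeddings: list of nonzero feature vectors (lists of floats)
    labels:     0 for real speech, 1 for fake (assumed convention)
    center:     nonzero weight vector representing the real-speech direction
    """
    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v]

    w = normalize(center)
    total = 0.0
    for x, y in zip(embeddings, labels):
        cos = sum(a * b for a, b in zip(w, normalize(x)))
        if y == 0:
            # Real: penalize cosine similarity below the high margin m_real.
            z = alpha * (m_real - cos)
        else:
            # Fake: penalize cosine similarity above the low margin m_fake.
            z = alpha * (cos - m_fake)
        # Numerically stable softplus: log(1 + exp(z)).
        total += max(z, 0.0) + math.log1p(math.exp(-abs(z)))
    return total / len(labels)
```

Under these assumptions, a real embedding aligned with `center` incurs a small loss, while a real embedding far from it (or a fake embedding close to it) incurs a large one, which is what compacts the real class into a narrow cone.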
Abstract
With the proliferation of deepfake audio, there is an urgent need to investigate its attribution. Current source tracing methods can effectively distinguish in-distribution (ID) categories. However, the rapid evolution of deepfake algorithms poses a critical challenge for the accurate identification of novel, out-of-distribution (OOD) deepfake algorithms. In this paper, we propose the Real Emphasis and Fake Dispersion (REFD) strategy for audio deepfake algorithm recognition and demonstrate its effectiveness in discriminating ID samples while identifying OOD samples. For effective OOD detection, we first explore current post-hoc OOD methods and then propose NSD, a novel OOD approach that identifies novel deepfake algorithms by considering the similarity of both features and logit scores. REFD achieves an F1-score of 86.83% as a single system in the Audio Deepfake Detection Challenge 2023 Track 3, showcasing its state-of-the-art performance.
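As a rough illustration of a post-hoc OOD score that considers both features and logits, the sketch below averages two cosine similarities against a bank of stored ID training examples and flags a sample as novel when the score falls below a threshold. This is a hypothetical stand-in, not the NSD formula from the paper: the function names, the nearest-neighbor cosine similarity, the equal-weight average, and the threshold value are all assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two nonzero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_ood_score(feature, logits, id_features, id_logits):
    """Higher score means the sample looks more like known (ID) data.

    Hypothetical combination: the best feature similarity and the best
    logit similarity over the ID bank, averaged with equal weight.
    """
    feat_sim = max(cosine(feature, f) for f in id_features)
    logit_sim = max(cosine(logits, l) for l in id_logits)
    return 0.5 * (feat_sim + logit_sim)


def is_novel(feature, logits, id_features, id_logits, threshold=0.5):
    """Flag a sample as an unknown algorithm when its score is low.

    The threshold is an assumed value; in practice it would be tuned
    on held-out data.
    """
    score = similarity_ood_score(feature, logits, id_features, id_logits)
    return score < threshold
```

A sample whose features and logits both resemble a stored ID example scores close to 1 and is kept as ID, while a sample that resembles none of them scores low and is routed to the "unknown algorithm" class.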
