ADD 2023: Towards Audio Deepfake Detection and Analysis in the Wild

Jiangyan Yi, Chu Yuan Zhang, Jianhua Tao, Chenglong Wang, Xinrui Yan, Yong Ren, Hao Gu, Junzuo Zhou

TL;DR

ADD 2023 extends audio deepfake research beyond binary detection by introducing manipulation-region localization and source-attribution tasks, backed by a four-track dataset designed to mimic real-world conditions. The paper analyzes top-performing methods across the fake game generation (FG-G), fake game detection (FG-D), manipulation region location (RL), and deepfake algorithm recognition (AR) tracks. It highlights the common use of Wav2Vec 2.0 embeddings, AASIST back-ends, and vocoder- and TTS-based generation strategies, and notes limitations in handling unseen attacks, generalization, and interpretability. Key contributions include detailed per-track dataset descriptions, technical analyses of prevailing approaches, and a roadmap emphasizing real-time processing, multilinguality, and standardized evaluation benchmarks. The work has practical relevance for forensics, law enforcement, and media integrity, providing datasets and insights for building more robust, trustworthy anti-deepfake systems.

Abstract

Audio deepfake detection is growing in prominence because of its wide range of applications, notably protecting the public from fraud and other malicious activities, and the field warrants greater attention and research. The ADD 2023 challenge goes beyond binary real/fake classification by emulating real-world scenarios, such as identifying manipulated intervals in partially fake audio and determining the source that generated a fake audio clip; both have real-life implications, notably in audio forensics, law enforcement, and the construction of reliable and trustworthy evidence. To foster further research in this area, this article describes the dataset used in the fake game, manipulation region location, and deepfake algorithm recognition tracks of the challenge. We also analyze the technical methodologies of the top-performing participants in each task and note the commonalities and differences in their approaches. Finally, we discuss the current technical limitations identified through this analysis and provide a roadmap for future research directions. The dataset is available for download at http://addchallenge.cn/downloadADD2023.

Paper Structure (23 sections, 1 figure, 8 tables)