To what extent can ASV systems naturally defend against spoofing attacks?

Jee-weon Jung; Xin Wang; Nicholas Evans; Shinji Watanabe; Hye-jin Shim; Hemlata Tak; Siddhant Arora; Junichi Yamagishi; Joon Son Chung

TL;DR

As ASV systems have evolved, they have inherently acquired some defense against spoofing attacks. However, spoofing attacks are advancing far faster than ASV systems, which calls for further research on spoofing-robust ASV methodologies.

Abstract

The current automatic speaker verification (ASV) task involves making binary decisions on two types of trials: target and non-target. However, emerging advancements in speech generation technology pose significant threats to the reliability of ASV systems. This study investigates whether ASV effortlessly acquires robustness against spoofing attacks (i.e., zero-shot capability) by systematically exploring diverse ASV systems and spoofing attacks, ranging from traditional to cutting-edge techniques. Through extensive analyses conducted on eight distinct ASV systems and 29 spoofing attack systems, we demonstrate that the evolution of ASV inherently incorporates defense mechanisms against spoofing attacks. Nevertheless, our findings also underscore that the advancement of spoofing attacks far outpaces that of ASV systems, hence necessitating further research on spoofing-robust ASV methodologies.

Paper Structure (16 sections, 3 figures, 3 tables)


Introduction
ASV systems
Corpora
Training
Evaluation
Experimental configurations
GMM-UBM and i-vector
DNN-based systems
Metrics
Results
ASV performances
Does ASV progression naturally enhance defense against spoofing attacks?
Insights from the spoofing attacks
Discussion: potential of SASV research
Conclusion and future works
...and 1 more sections

Figures (3)

Figure 1: Average spoof equal error rates (SPF-EERs) over 29 different spoofing attacks for eight automatic speaker verification (ASV) systems, displayed chronologically. The SPF-EER uses spoof trials in place of conventional non-target trials; in a spoof trial, the test utterance is a system-generated voice of the target speaker. Conventional EERs (equivalent to SV-EER in jung2022sasv) of the ASV systems on the Vox1-O evaluation protocol are also reported as a reference.
Figure 2: Detailed analyses of the eight ASV systems, sorted chronologically. (left): different groups of spoofing attacks. (middle): TTS vs. VC attacks. (right): DNN-based vs. non-DNN-based spoofing attacks. Group 1: no neural networks involved. Group 2: only the acoustic model is a neural network. Group 3: both the acoustic and waveform models are neural networks. Group 4: non-parametric systems.
Figure 3: Analyses from the perspective of spoofing attacks. (left): averaged ASV results in terms of groups. (middle): five ASV systems' results in terms of groups. (right): averaged ASV results of all 29 spoofing attacks in terms of years.
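
The Figure 1 caption defines the SPF-EER as an equal error rate in which spoof trials replace the usual non-target trials. The following is a minimal sketch of that idea, not the authors' evaluation code: the function names `equal_error_rate`, `sv_eer`, and `spf_eer` are hypothetical, scores are assumed to be "higher means more likely the target speaker", and the EER is approximated by a simple threshold sweep rather than by interpolating the error curves.

```python
import numpy as np


def equal_error_rate(accept_scores, reject_scores):
    """Approximate EER: the point where false rejection ~= false acceptance.

    accept_scores: scores of trials that should be accepted (target trials).
    reject_scores: scores of trials that should be rejected.
    Assumption: a higher score means "more likely the target speaker".
    """
    pos = np.asarray(accept_scores, dtype=float)
    neg = np.asarray(reject_scores, dtype=float)
    # Sweep every observed score as a candidate decision threshold.
    thresholds = np.sort(np.concatenate([pos, neg]))
    # False rejection rate: target trials scoring below the threshold.
    frr = np.array([(pos < t).mean() for t in thresholds])
    # False acceptance rate: rejectable trials scoring at or above it.
    far = np.array([(neg >= t).mean() for t in thresholds])
    # Pick the threshold where the two error rates are closest.
    idx = np.argmin(np.abs(far - frr))
    return (far[idx] + frr[idx]) / 2.0


def sv_eer(target_scores, nontarget_scores):
    """Conventional EER: target vs. non-target (different speaker) trials."""
    return equal_error_rate(target_scores, nontarget_scores)


def spf_eer(target_scores, spoof_scores):
    """SPF-EER: spoof trials take the place of non-target trials."""
    return equal_error_rate(target_scores, spoof_scores)
```

With made-up scores, `sv_eer(target, nontarget)` and `spf_eer(target, spoof)` can be compared for the same system; a spoof set whose scores overlap the target scores more than the non-target scores do yields a higher SPF-EER than SV-EER.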
