SZU-AFS Antispoofing System for the ASVspoof 5 Challenge

Yuxiong Xu, Jiafeng Zhong, Sengui Zheng, Zefeng Liu, Bin Li

TL;DR

This work presents SZU-AFS, a four-stage anti-spoofing system for ASVspoof 5 Track 1 under open conditions, built on a baseline that pairs a Wav2Vec2 front-end with an AASIST back-end. It systematically explores three data augmentation policies (single-DA, random-DA, cascade-DA) for primary fine-tuning, introduces a gradient norm aware minimization (GAM)-based co-enhancement strategy for secondary fine-tuning, and fuses the logit scores of the two best models. The most effective configuration combines the RIR-TimeMask cascade augmentation with GAM, yielding the top models C11 and C12. Their fusion (D4) reaches $\mathrm{minDCF}_{prog}=0.027$ and $\mathrm{EER}_{prog}=0.99\%$ in the progress results, and $\mathrm{minDCF}_{eval}=0.115$ and $\mathrm{EER}_{eval}=4.04\%$ on the final evaluation set. The findings indicate that combining augmentation at the data level with flat-minima-seeking optimization improves generalization in open-condition spoofing detection, with practical implications for robust ASV systems.

Abstract

This paper presents the SZU-AFS anti-spoofing system, designed for Track 1 of the ASVspoof 5 Challenge under open conditions. The system is built in four stages: selecting a baseline model, exploring effective data augmentation (DA) methods for fine-tuning, applying a co-enhancement strategy based on gradient norm aware minimization (GAM) for secondary fine-tuning, and fusing the logit scores from the two best-performing fine-tuned models. The baseline model combines the Wav2Vec2 front-end feature extractor with the AASIST back-end classifier. During fine-tuning, three distinct DA policies were investigated: single-DA, random-DA, and cascade-DA. The GAM-based co-enhancement strategy, designed to fine-tune the augmented model at both the data and optimizer levels, helps the Adam optimizer find flatter minima, thereby boosting model generalization. Overall, the final fusion system achieves a minDCF of 0.115 and an EER of 4.04% on the evaluation set.
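The final fusion stage can be illustrated with a minimal sketch of score-level averaging. It assumes each model emits one logit score per utterance, with the utterances in the same order. The function name `fuse_logit_scores`, the optional `weights` argument, and the example scores are hypothetical and not taken from the paper.

```python
import numpy as np


def fuse_logit_scores(score_sets, weights=None):
    """Average per-utterance scores across models (score-level fusion).

    score_sets: sequence of equal-length score vectors, one per model.
    weights:    optional per-model weights (hypothetical extension);
                None gives the plain, equal-weight average.
    """
    scores = np.asarray(score_sets, dtype=np.float64)  # (n_models, n_utts)
    if scores.ndim != 2:
        raise ValueError("expected one equal-length score vector per model")
    return np.average(scores, axis=0, weights=weights)


# Made-up scores for three utterances from two models (illustration only).
c11_scores = [2.0, -1.0, 0.5]
c12_scores = [1.0, -3.0, 1.5]
fused = fuse_logit_scores([c11_scores, c12_scores])
```

Equal weighting needs no tuning on held-out data, which keeps the fusion step free of extra parameters. Whether the two models' scores were normalized before averaging is not stated here, so the sketch averages them as given.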

Paper Structure

This paper contains 28 sections, 3 equations, 2 figures, 5 tables, 1 algorithm.

Figures (2)

  • Figure 1: Illustration of the SZU-AFS anti-spoofing system. The colored boxes represent the four stages of the system, with each stage labeled by model IDs from A to D. The best-performing model in each stage and its ID number are presented in bold. First, a baseline model (A9) was selected, combining the Wav2Vec2 feature extractor with the AASIST classifier. The A9 model was then fine-tuned using the RIR-TimeMask method to obtain the best augmented model (B5), which was further fine-tuned using a GAM-based co-enhancement strategy. Finally, the logit scores from the C11 and C12 models were fused using an average score-level fusion method, and the results were submitted for evaluation on the CodaLab platform.
  • Figure 2: Illustration of the three different DA policies. To enhance the generalization abilities of the A9 model, we experiment with three distinct DA policies, including single-DA, random-DA, and cascade-DA.
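The three DA policies named in Figure 2 can be sketched as follows. This is one plausible reading, not the authors' implementation: single-DA applies one fixed augmentation, random-DA draws one augmentation at random from a pool, and cascade-DA applies several in sequence (for example RIR followed by TimeMask). The helpers `toy_rir` and `time_mask` are hypothetical waveform-level stand-ins. In particular, `toy_rir` uses a synthetic decaying impulse response rather than a measured room impulse response.

```python
import numpy as np


def time_mask(wave, rng, max_frac=0.1):
    """Zero out one random contiguous span (toy time masking)."""
    out = wave.copy()
    width = int(rng.integers(0, int(len(out) * max_frac) + 1))
    start = int(rng.integers(0, len(out) - width + 1))
    out[start:start + width] = 0.0
    return out


def toy_rir(wave, rng, taps=64):
    """Convolve with a synthetic decaying impulse response.

    Stand-in for a measured room impulse response (assumption).
    """
    ir = rng.standard_normal(taps) * np.exp(-np.arange(taps) / 8.0)
    ir[0] = 1.0  # direct path
    out = np.convolve(wave, ir)[:len(wave)]  # keep the original length
    peak = np.max(np.abs(out))
    return out / peak if peak > 0 else out  # peak-normalize


def single_da(wave, rng, aug):
    """single-DA: apply one fixed augmentation."""
    return aug(wave, rng)


def random_da(wave, rng, pool):
    """random-DA: apply one augmentation drawn uniformly from a pool."""
    aug = pool[int(rng.integers(len(pool)))]
    return aug(wave, rng)


def cascade_da(wave, rng, chain):
    """cascade-DA: apply every augmentation in the chain, in order."""
    for aug in chain:
        wave = aug(wave, rng)
    return wave


rng = np.random.default_rng(0)
wave = rng.standard_normal(16000)  # 1 s of noise at 16 kHz as dummy audio
augmented = cascade_da(wave, rng, [toy_rir, time_mask])
```

Passing the augmentations as plain callables keeps the three policies interchangeable, so a chain such as `[toy_rir, time_mask]` can be swapped for a single function or a random pool without touching the training loop.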