BUT Systems and Analyses for the ASVspoof 5 Challenge
Johan Rohdin, Lin Zhang, Oldřich Plchot, Vojtěch Staněk, David Mihola, Junyi Peng, Themos Stafylakis, Dmitriy Beveraki, Anna Silnova, Jan Brukner, Lukáš Burget
TL;DR
This work presents BUT's submissions to ASVspoof 5. For track 1 (deepfake detection), it describes a ResNet18-based model for the closed condition, trained with several label schemes, and a pretrained SSL front-end with MHFA pooling for the open condition. For track 2, spoofing-robust automatic speaker verification (SASV), it introduces a generalized SASV LLR framework built on effective priors, and calibrates the countermeasure (CM) and automatic speaker verification (ASV) scores via logistic regression to optimize decision costs. Key contributions are a systematic analysis of label schemes for deepfake detection, effective priors for SASV decisions, and a discriminative calibration approach that improves SASV metrics in both the closed and open conditions. The results show competitive performance in both tracks and highlight the importance of calibration-aware fusion for spoofing-robust speaker verification systems.
Abstract
This paper describes the systems that BUT submitted to the ASVspoof 5 challenge, along with analyses of them. For the conventional deepfake detection task, we use ResNet18 and self-supervised models for the closed and open conditions, respectively. In addition, we analyze and visualize different combinations of speaker information and spoofing information as label schemes for training. For spoofing-robust automatic speaker verification (SASV), we introduce effective priors and propose using logistic regression to jointly train affine transformations of the countermeasure scores and the automatic speaker verification scores in such a way that the SASV LLR is optimized.
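To make the last idea concrete, the following is a minimal sketch of how affinely calibrated countermeasure and speaker verification scores could be combined into a single SASV LLR, together with a prior-weighted logistic regression objective. It is an illustration under stated assumptions, not necessarily the exact formulation used in the submitted systems: the two scores are assumed conditionally independent, the speaker verification scores of spoofed trials are assumed to be distributed like those of target trials, and the names `sasv_llr`, `weighted_logistic_loss`, `params`, `rho` and `prior_target` are hypothetical. Under these assumptions the LLR of "bona fide target" against "non-target or spoof" is $-\log(\rho e^{-\ell_{\mathrm{asv}}} + (1-\rho) e^{-\ell_{\mathrm{cm}}})$, where $\rho$ is the share of non-target trials among the two negative classes.

```python
import numpy as np


def sasv_llr(s_asv, s_cm, params, rho):
    """Combine raw ASV and CM scores into one SASV LLR (illustrative sketch).

    params: (a_asv, b_asv, a_cm, b_cm), hypothetical affine calibration parameters.
    rho:    assumed share of non-target trials among the negative classes, 0 < rho < 1.
    """
    a_asv, b_asv, a_cm, b_cm = params
    # Affine calibration turns each raw score into an (assumed) LLR.
    l_asv = a_asv * np.asarray(s_asv, dtype=float) + b_asv
    l_cm = a_cm * np.asarray(s_cm, dtype=float) + b_cm
    # -log(rho * exp(-l_asv) + (1 - rho) * exp(-l_cm)), computed stably.
    return -np.logaddexp(np.log(rho) - l_asv, np.log1p(-rho) - l_cm)


def weighted_logistic_loss(llr, is_target, prior_target):
    """Prior-weighted binary cross-entropy of the SASV LLR.

    is_target:    True for bona fide target trials, False for non-target or spoof.
    prior_target: assumed effective prior of the target class, 0 < prior_target < 1.
    """
    llr = np.asarray(llr, dtype=float)
    is_target = np.asarray(is_target, dtype=bool)
    # Log posterior odds = LLR + log prior odds.
    z = llr + np.log(prior_target) - np.log1p(-prior_target)
    loss_pos = np.logaddexp(0.0, -z[is_target]).mean()   # softplus(-z) on targets
    loss_neg = np.logaddexp(0.0, z[~is_target]).mean()   # softplus(z) on negatives
    return prior_target * loss_pos + (1.0 - prior_target) * loss_neg
```

Because the loss is a differentiable function of all four calibration parameters through `sasv_llr`, they can be optimized jointly with any gradient-based optimizer rather than calibrating each subsystem separately.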
