Text-dependent Speaker Verification (TdSV) Challenge 2024: Challenge Evaluation Plan

Zeinali Hossein, Lee Kong Aik, Alam Jahangir, Burget Lukas

TL;DR

The TdSV Challenge 2024 targets robust text-dependent speaker verification in two tracks: conventional TD-SV and user-defined passphrases, using the DeepMine Persian-English corpus to evaluate fixed-phrase and passphrase-content verification under strict training, enrollment, and testing rules. The approach emphasizes building a single, competitive system while enabling thorough analysis and integration of techniques such as multi-task and self-supervised learning, with performance tracked by normalized minDCF from SRE08 ($C_{Det}=C_{Miss} \times P_{Miss|Target} \times P_{Target} + C_{FalseAlarm} \times P_{FalseAlarm|NonTarget} \times (1 - P_{Target})$, where $C_{Miss}=10$, $C_{FalseAlarm}=1$, $P_{Target}=0.01$, and $DCF_{norm}=DCF/0.1$), alongside EER. Data management combines fixed in-domain and diverse public data under a fixed training condition, with standardized enrollment/test formats and a rigorous evaluation protocol via Codabench and a public leaderboard, fostering reproducibility and practical impact for secure, user-specific voice verification systems. The plan also specifies a common ECAPA-TDNN baseline, prize incentives, and a detailed schedule to catalyze timely submissions and system descriptions for SLT dissemination.
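The cost function quoted above can be illustrated with a minimal Python sketch. It uses only the constants stated in the text ($C_{Miss}=10$, $C_{FalseAlarm}=1$, $P_{Target}=0.01$, normalization by 0.1). The function names `detection_cost` and `normalized_min_dcf`, the "accept if score >= threshold" decision rule, and the brute-force threshold sweep are assumptions made for illustration; this is not the challenge's official scoring script.

```python
# Constants as stated in the evaluation plan summary.
C_MISS = 10.0
C_FA = 1.0
P_TARGET = 0.01
NORM = 0.1  # DCF_norm = DCF / 0.1


def detection_cost(p_miss, p_fa):
    """C_Det for given miss and false-alarm rates."""
    return C_MISS * p_miss * P_TARGET + C_FA * p_fa * (1.0 - P_TARGET)


def normalized_min_dcf(target_scores, nontarget_scores):
    """Minimum normalized DCF over all candidate thresholds.

    Assumption: a trial is accepted when score >= threshold.
    """
    # Every observed score is a candidate threshold; +inf rejects all trials.
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    thresholds.append(float("inf"))

    best = float("inf")
    for t in thresholds:
        p_miss = sum(s < t for s in target_scores) / len(target_scores)
        p_fa = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        best = min(best, detection_cost(p_miss, p_fa))
    return best / NORM


if __name__ == "__main__":
    # Toy scores, invented purely for illustration.
    targets = [0.9, 0.8, 0.2]
    nontargets = [0.7, 0.1, 0.0]
    print(normalized_min_dcf(targets, nontargets))
```

With these constants, a system that rejects every trial has a cost of exactly 0.1, so the normalized value is 1.0 for that trivial system and values below 1.0 indicate an improvement over it. The quadratic sweep is adequate for a toy example; a sorted cumulative count would be preferable for full trial lists.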

Abstract

This document outlines the Text-dependent Speaker Verification (TdSV) Challenge 2024, which centers on analyzing and exploring novel approaches for text-dependent speaker verification. The primary goal of this challenge is to motivate participants to develop single yet competitive systems, conduct thorough analyses, and explore innovative concepts such as multi-task learning, self-supervised learning, and few-shot learning for text-dependent speaker verification.

Paper Structure

This paper contains 30 sections, 1 equation, 2 tables.