Alpha Divergence Losses for Biometric Verification

Dimitrios Koutsianos, Ladislav Mosner, Yannis Panagakis, Themos Stafylakis

TL;DR

This work reframes margin-based biometric verification through the lens of α-divergence losses, enabling sparse posteriors when $α>1$ and introducing two practical strategies to integrate angular margins: Q-Margin, which encodes the margin in the reference measure, and Alpha-Additive-Angular Margin (A3M), which applies the margin to the logits. To address optimization instability caused by sparsity, the authors add A3M-I, a mid-training prototype re-initialization that realigns identity prototypes with embeddings. Across face and speaker verification benchmarks, including IJB-B, IJB-C, and VoxCeleb, the proposed methods yield substantial gains at low false-acceptance rates, while preserving extreme posterior sparsity that suggests potential memory-efficient training. The results demonstrate that combining probabilistic margins with geometric margins can improve verification performance in large-scale, real-world scenarios with millions of identities.

Abstract

Performance in face and speaker verification is largely driven by margin-based softmax losses such as CosFace and ArcFace. Recently introduced $α$-divergence loss functions offer a compelling alternative, particularly due to their ability to induce sparse solutions (when $α>1$). However, integrating an angular margin, which is crucial for verification tasks, is not straightforward. We find that this integration can be achieved in at least two distinct ways: via the reference measure (prior probabilities) or via the logits (unnormalized log-likelihoods). In this paper, we explore both pathways, deriving two novel margin-based $α$-divergence losses: Q-Margin (margin in the reference measure) and A3M (margin in the logits). We identify a training instability in A3M that is caused by sparsity, and address it with a simple yet effective prototype re-initialization strategy. Our methods achieve significant performance gains on the challenging IJB-B and IJB-C face verification benchmarks. We demonstrate similarly strong performance in speaker verification on VoxCeleb. Crucially, our models significantly outperform strong baselines at low false acceptance rates (FAR). This capability is critical for practical high-security applications, such as banking authentication, where minimizing false authentications is paramount. Finally, the sparsity of $α$-divergence-based posteriors enables memory-efficient training, which is crucial for datasets with millions of identities.
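To make the two ingredients named in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It shows (i) an ArcFace-style additive angular margin applied to the target-class logit and (ii) how a sparse posterior differs from a softmax posterior. Sparsemax is used here purely as a well-known stand-in for a sparsity-inducing mapping; it is an assumption of this illustration and is not the paper's $α$-divergence operator, Q-Margin, or A3M. The function names, the example cosine values, and the defaults `s=32.0`, `m=0.2` are likewise illustrative.

```python
import numpy as np

def additive_angular_margin_logits(cosines, target, s=32.0, m=0.2):
    """Scale cosine similarities by s and add margin m to the target-class angle."""
    cosines = np.clip(np.asarray(cosines, dtype=float), -1.0, 1.0)
    theta = np.arccos(cosines[target])      # angle between embedding and target prototype
    logits = s * cosines                    # scaled logits for all classes
    logits[target] = s * np.cos(theta + m)  # penalize the target class by the margin
    return logits

def softmax(z):
    """Dense posterior: every class receives strictly positive probability."""
    e = np.exp(z - z.max())
    return e / e.sum()

def sparsemax(z):
    """Euclidean projection of z onto the probability simplex (stand-in sparse posterior)."""
    z = np.asarray(z, dtype=float)
    z_sorted = np.sort(z)[::-1]
    cumsum = np.cumsum(z_sorted)
    k = np.arange(1, z.size + 1)
    support = 1 + k * z_sorted > cumsum     # classes that stay in the support
    k_max = k[support][-1]
    tau = (cumsum[k_max - 1] - 1) / k_max   # threshold subtracted from every logit
    return np.maximum(z - tau, 0.0)

# Hypothetical cosine similarities between one embedding and four identity prototypes.
cosines = [0.7, 0.3, 0.1, -0.2]
logits = additive_angular_margin_logits(cosines, target=0)
dense = softmax(logits)     # all entries positive
sparse = sparsemax(logits)  # some entries exactly zero
```

In this toy setting the sparse posterior assigns exactly zero probability to low-scoring identities, which is the property the abstract links to memory-efficient training when the number of identities is very large.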

Paper Structure

This paper contains 24 sections, 11 equations, 3 figures, 3 tables.

Figures (3)

  • Figure 1: False acceptance vs missed detection probabilities for various losses on IJB-B and IJB-C, focused on low false acceptance operating points. A3M-I and Q-Margin-I are configurations of A3M and Q-Margin, respectively, with Prototype Re-Initialization.
  • Figure 2: Challenging pairs from AgeDB-30, CALFW, and CPLFW where our proposed method (Q-Margin, $\alpha=1.25$, $s=32$, $m=0.2$) succeeds, while the ArcFace baseline fails. The top row shows genuine pairs correctly accepted by our model and falsely rejected by the baseline. The bottom row shows impostor pairs correctly rejected by our model and falsely accepted by the baseline.
  • Figure 3: False acceptance vs missed detection probabilities for various losses on VoxCeleb1-H, focused on low false acceptance operating points. A3M-I is a configuration of A3M with Prototype Re-Initialization.