Test-Time Domain Generalization for Face Anti-Spoofing

Qianyu Zhou; Ke-Yue Zhang; Taiping Yao; Xuequan Lu; Shouhong Ding; Lizhuang Ma

TL;DR

This paper tackles face anti-spoofing (FAS) under domain shift by proposing TTDG, a framework that uses test-time data to improve generalization without updating the model at test time. It combines Test-Time Style Projection (TTSP), which maps unseen styles onto the style bases of the source space, with Diverse Style Shifts Simulation (DSSS), which generates diverse style shifts from learnable bases in a hyperspherical feature space, guided by a style loss and a content loss. The theoretical analysis relies on the $d_{\mathcal{H}}$-divergence and the source convex hull $\Lambda_s$, arguing that reducing the gap $\gamma$ between the ideal target and the real target lowers the target risk $\epsilon_t(h)$. Across four cross-domain FAS benchmarks, TTDG reports state-of-the-art results with no test-time updates, and the approach is compatible with both CNN and ViT backbones, which makes it practical for real-world deployment.

Abstract

Face Anti-Spoofing (FAS) is pivotal in safeguarding facial recognition systems against presentation attacks. While domain generalization (DG) methods have been developed to enhance FAS performance, they predominantly focus on learning domain-invariant features during training, which may not guarantee generalizability to unseen data that differs largely from the source distributions. Our insight is that testing data can serve as a valuable resource to enhance the generalizability beyond mere evaluation for DG FAS. In this paper, we introduce a novel Test-Time Domain Generalization (TTDG) framework for FAS, which leverages the testing data to boost the model's generalizability. Our method, consisting of Test-Time Style Projection (TTSP) and Diverse Style Shifts Simulation (DSSS), effectively projects the unseen data to the seen domain space. In particular, we first introduce the innovative TTSP to project the styles of the arbitrarily unseen samples of the testing distribution to the known source space of the training distributions. We then design the efficient DSSS to synthesize diverse style shifts via learnable style bases with two specifically designed losses in a hyperspherical feature space. Our method eliminates the need for model updates at the test time and can be seamlessly integrated into not only the CNN but also ViT backbones. Comprehensive experiments on widely used cross-domain FAS benchmarks demonstrate our method's state-of-the-art performance and effectiveness.
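
The style projection described in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation; it assumes that "style" means channel-wise feature mean and standard deviation (as in AdaIN-style methods), that similarity to the style bases is measured by cosine similarity, and that the projected style is a softmax-weighted combination of the bases. The names `style_stats`, `project_style`, `base_mu`, `base_sigma`, and `temperature` are hypothetical.

```python
import numpy as np


def style_stats(feat, eps=1e-6):
    """Channel-wise mean and std of a (C, H, W) feature map (assumed style statistics)."""
    mu = feat.mean(axis=(1, 2))
    sigma = np.sqrt(feat.var(axis=(1, 2)) + eps)
    return mu, sigma


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()


def project_style(feat, base_mu, base_sigma, temperature=1.0):
    """Re-style an unseen feature map using K source style bases.

    feat:       (C, H, W) feature map of a test sample
    base_mu:    (K, C) means of the style bases (hypothetical parameters)
    base_sigma: (K, C) positive stds of the style bases (hypothetical parameters)
    Returns the re-styled feature map and the (K,) projection weights.
    """
    mu, sigma = style_stats(feat)

    # Cosine similarity between the test style and each style basis.
    query = np.concatenate([mu, sigma])                    # (2C,)
    bases = np.concatenate([base_mu, base_sigma], axis=1)  # (K, 2C)
    query = query / (np.linalg.norm(query) + 1e-12)
    bases = bases / (np.linalg.norm(bases, axis=1, keepdims=True) + 1e-12)
    weights = softmax(bases @ query / temperature)         # (K,), sums to 1

    # Projected style: a convex combination of the source bases.
    proj_mu = weights @ base_mu
    proj_sigma = weights @ base_sigma

    # Remove the unseen style, then apply the projected source-space style.
    normalized = (feat - mu[:, None, None]) / sigma[:, None, None]
    restyled = normalized * proj_sigma[:, None, None] + proj_mu[:, None, None]
    return restyled, weights
```

Because the weights are non-negative and sum to one, the projected statistics always lie inside the convex hull of the bases. In this sketch that is what keeps an arbitrary test style within the region spanned by the source styles, and nothing in the model has to be updated at test time.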

Paper Structure (14 sections, 13 equations, 5 figures, 6 tables)

Introduction
Related Work
Methodology
Theoretical Analysis
Test-Time Style Projection
Diverse Style Shifts Simulation
Training and Inference
Experiments
Experimental Setup
Comparisons to the State-of-the-art Methods
Ablation Studies
Visualization and Analysis
Conclusion
Acknowledgement

Figures (5)

Figure 1: Conventional DG FAS approaches typically learn domain-invariant features at train time but cannot guarantee generalizability to unseen data that largely differ from source domains. In contrast, we propose test-time DG for FAS that projects the unseen testing data to the seen space, thus enhancing the generalizability of the FAS model without any model updates at test time.
Figure 2: Overview of the proposed Test-Time Domain Generalization (TTDG) framework for DG FAS. In particular, we first introduce Test-Time Style Projection (TTSP) to project arbitrarily unseen samples to the known source space based on the similarity between the unseen sample and the style bases. We then design Diverse Style Shifts Simulation (DSSS) to synthesize diverse style shifts via learnable style bases. $\mathcal{L}_{\text{sty}}$ and $\mathcal{L}_{\text{con}}$ are two new losses for maximizing the style diversity and content consistency in a hyperspherical feature space. Our TTDG eliminates the need for model updates at test time and can be seamlessly integrated into the CNN and ViT backbones.
Figure 3: Comparison of t-SNE (van der Maaten and Hinton, 2008) feature visualizations for train-time DG and our test-time DG method.
Figure 4: Hyper-parameter analyses on the O&C&M to I setting.
Figure 5: t-SNE (van der Maaten and Hinton, 2008) visualization of features for different domains before (a) and after (b) test-time style projection.
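
The Figure 2 caption names two losses, $\mathcal{L}_{\text{sty}}$ for style diversity and $\mathcal{L}_{\text{con}}$ for content consistency, both defined in a hyperspherical feature space. The exact formulas are not given in this summary, so the sketch below is only one plausible reading: features are L2-normalized onto the unit hypersphere, diversity is encouraged by penalizing the mean pairwise cosine similarity between style bases, and consistency is measured as one minus the cosine similarity between content features before and after a style shift. All function names are hypothetical, and only the forward computation is shown.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Project row vectors onto the unit hypersphere."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def style_diversity_loss(bases):
    """Mean pairwise cosine similarity between K style bases of shape (K, D).

    Assumed form, not the paper's formula: minimizing this value pushes
    the bases apart on the hypersphere, i.e. makes the styles more diverse.
    """
    b = l2_normalize(bases)
    sim = b @ b.T                                # (K, K) cosine similarities
    k = bases.shape[0]
    off_diagonal = sim[~np.eye(k, dtype=bool)]   # drop self-similarities
    return off_diagonal.mean()


def content_consistency_loss(content_orig, content_shifted):
    """Mean (1 - cosine similarity) between (N, D) content features.

    Assumed form, not the paper's formula: it is zero when a style shift
    leaves the direction of every content feature unchanged.
    """
    a = l2_normalize(content_orig)
    b = l2_normalize(content_shifted)
    return (1.0 - (a * b).sum(axis=1)).mean()
```

Under these assumptions, mutually orthogonal bases give a diversity loss of zero and identical bases give a loss of one, so a training objective would minimize both terms together.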
