DiffVein: A Unified Diffusion Network for Finger Vein Segmentation and Authentication

Yanjun Liu, Wenming Yang, Qingmin Liao

TL;DR

DiffVein introduces a unified diffusion-model framework for finger vein segmentation and authentication, coupling a segmentation branch with a denoising diffusion path to enable mutual information exchange. It adds a mask condition to guide denoising and a Semantic Difference Transformer to fuse diffusion-derived category embeddings into segmentation, guided by a Fourier-space Structural Similarity loss. Across USM and THU-MVFV3V datasets, DiffVein achieves state-of-the-art performance in both verification (EER as low as $0.089\%$) and identification (ACC up to $99.79\%$) while delivering superior segmentation topology (high clDice scores). These results demonstrate the practical potential of cross-task diffusion-driven biometric recognition, with ablation confirming the contributions of diffusion, conditioning, FourierSIM, and SD-Former to overall gains.
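The verification results above are reported as an equal error rate (EER), the operating point at which the false accept rate and the false reject rate coincide. As a reading aid, the following is a minimal sketch of how an EER can be approximated from matching scores. It is not the authors' evaluation code: the function name `equal_error_rate`, the convention that higher scores mean "same finger", and the toy scores are all assumptions made for illustration.

```python
def equal_error_rate(genuine, impostor):
    """Approximate the EER from similarity scores.

    Assumption: a higher score means the pair more likely comes from
    the same finger. `genuine` holds scores of same-finger pairs,
    `impostor` holds scores of different-finger pairs.
    """
    if not genuine or not impostor:
        raise ValueError("both score lists must be non-empty")

    best_gap, eer = None, None
    # Every observed score is a candidate decision threshold.
    for t in sorted(set(genuine) | set(impostor)):
        # False accept: an impostor pair scoring at or above the threshold.
        far = sum(s >= t for s in impostor) / len(impostor)
        # False reject: a genuine pair scoring below the threshold.
        frr = sum(s < t for s in genuine) / len(genuine)
        gap = abs(far - frr)
        # Keep the threshold where the two error rates are closest.
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy scores (hypothetical, not taken from the paper).
genuine_scores = [0.9, 0.8, 0.3, 0.7]
impostor_scores = [0.1, 0.2, 0.75, 0.4]
print(equal_error_rate(genuine_scores, impostor_scores))
```

With only a handful of scores the two error curves rarely cross exactly, so the sketch returns the midpoint of the two rates at the closest threshold. Evaluations on full datasets typically interpolate along the ROC curve instead.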

Abstract

Finger vein authentication, recognized for its high security and specificity, has become a focal point in biometric research. Traditional methods predominantly concentrate on vein feature extraction for discriminative modeling, with limited exploration of generative approaches. Existing methods often fail to recover authentic vein patterns through segmentation, which in turn leads to verification failures. To fill this gap, we introduce DiffVein, a unified diffusion model-based framework that simultaneously addresses vein segmentation and authentication. DiffVein is composed of two dedicated branches: one for segmentation and the other for denoising. To improve feature interaction between these two branches, we introduce two specialized modules. The first, a mask condition module, incorporates the semantic information of vein patterns from the segmentation branch into the denoising process. The second, a Semantic Difference Transformer (SD-Former), employs Fourier-space self-attention and cross-attention modules to extract category embeddings before feeding them to the segmentation task. In this way, our framework allows a dynamic interplay between diffusion and segmentation embeddings, so that vein segmentation and authentication can inform and enhance each other during joint training. To further optimize our model, we introduce a Fourier-space Structural Similarity (FourierSIM) loss function, which is tailored to improve the denoising network's learning efficacy. Extensive experiments on the USM and THU-MVFV3V datasets substantiate DiffVein's superior performance, setting new benchmarks in both vein segmentation and authentication.

Paper Structure (27 sections, 10 equations, 11 figures, 7 tables)

Figures (11)

  • Figure 1: Examples of finger vein segmentation based on traditional methods.
  • Figure 2: Examples of thick and thin vein patterns extracted by traditional methods.
  • Figure 3: The schematic illustration of DiffVein.
  • Figure 4: Illustration of Fourier-space attention modules.
  • Figure 5: Illustration of SD-Former architecture.
  • ...and 6 more figures