MsMemoryGAN: A Multi-scale Memory GAN for Palm-vein Adversarial Purification

Huafeng Qin, Yuming Fu, Huiyan Zhang, Mounim A. El-Yacoubi, Xinbo Gao, Qun Song, Jun Wang

TL;DR

This work tackles adversarial perturbations in palm-vein recognition by introducing MsMemoryGAN, a defense that purifies inputs through reconstruction from prototypical normal vein patterns stored in memory. It combines a multi-scale memory-augmented autoencoder with a patch-based discriminator within a GAN framework, employing a learnable memory-addressing metric and sparse memory usage to restrict reconstruction to normal patterns. The method optimizes a composite loss that includes $L_1$, a perceptual loss $L_p$ based on ResNet features, a sparsity term $L_s$, and an adversarial loss with an adaptive weight, yielding purified outputs that preserve vein structure while removing perturbations. Extensive experiments on TJU_PV and PolyU_MN under white-box and black-box attacks show state-of-the-art defense performance, with high recognition accuracy and substantial gains over competing defenses, highlighting the approach’s practical impact for secure vein biometrics in adversarial settings. Overall, MsMemoryGAN offers a scalable, memory-driven purification mechanism that enhances robustness in vein recognition without sacrificing detailed vein patterns.
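As a reading aid, the composite objective described above can be written in one line. This is only a sketch: the coefficients $\alpha$ and $\beta$ are placeholders that this summary does not specify, and $\lambda$ stands for the adaptive adversarial weight, whose exact definition is also not given here.

$$
L_{\text{total}} = L_1 + \alpha\, L_p + \beta\, L_s + \lambda\, L_{\text{adv}}
$$

Here $L_1$ is the pixel loss, $L_p$ the perceptual loss on ResNet features, $L_s$ the sparsity term on the memory-addressing weights, and $L_{\text{adv}}$ the adversarial loss from the patch-based discriminator.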

Abstract

Deep neural networks have recently achieved promising performance in vein recognition and are increasingly deployed. However, they are vulnerable to adversarial attacks, in which imperceptible perturbations added to the input cause incorrect recognition. To address this issue, we propose a novel defense model named MsMemoryGAN, which aims to filter the perturbations from adversarial samples before recognition. First, we design a multi-scale autoencoder to achieve high-quality reconstruction and two memory modules to learn the detailed patterns of normal samples at different scales. Second, we investigate a learnable metric in the memory module to retrieve the memory items most relevant for reconstructing the input image. Finally, a perceptual loss is combined with the pixel loss to further enhance the quality of the reconstructed image. During training, MsMemoryGAN learns to reconstruct the input using only a small number of prototypical elements of the normal patterns recorded in memory. At test time, given an adversarial sample, MsMemoryGAN retrieves the most relevant normal patterns from memory for the reconstruction. Perturbations in the adversarial sample are usually not reconstructed well, so the reconstruction purifies the input of adversarial perturbations. We conducted extensive experiments on two public vein datasets under different adversarial attack methods to evaluate the proposed approach. The experimental results show that our approach removes a wide variety of adversarial perturbations, allowing vein classifiers to achieve the highest recognition accuracy.
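The memory read described in the abstract (score memory items with a learnable metric, keep only a few, reconstruct from them) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the two-layer MLP scorer, the `1/N` shrinkage threshold, and all names (`mlp_score`, `sparse_address`, `memory_read`) are assumptions chosen for illustration, and the weights are random rather than trained.

```python
import numpy as np

def mlp_score(query, memory, w1, b1, w2, b2):
    """Hypothetical learnable metric: score each (query, item) pair with a small MLP."""
    n = memory.shape[0]
    pairs = np.concatenate([np.tile(query, (n, 1)), memory], axis=1)  # (N, 2C)
    hidden = np.maximum(pairs @ w1 + b1, 0.0)                         # ReLU, (N, H)
    return (hidden @ w2 + b2).ravel()                                 # (N,)

def sparse_address(scores, threshold):
    """Softmax over memory items, then zero out small weights and renormalise."""
    e = np.exp(scores - scores.max())
    weights = e / e.sum()
    kept = np.where(weights > threshold, weights, 0.0)
    if kept.sum() == 0.0:  # all weights equal: fall back to a single item
        kept[np.argmax(weights)] = 1.0
    return kept / kept.sum()

def memory_read(query, memory, params, threshold=None):
    """Rebuild the query as a sparse convex combination of memory items."""
    if threshold is None:
        threshold = 1.0 / memory.shape[0]  # assumed default, not from the paper
    weights = sparse_address(mlp_score(query, memory, *params), threshold)
    return weights @ memory, weights

rng = np.random.default_rng(0)
N, C, H = 8, 4, 16                      # memory size, feature dim, hidden dim
memory = rng.normal(size=(N, C))        # stand-in for learned normal patterns
params = (rng.normal(size=(2 * C, H)), np.zeros(H),
          rng.normal(size=(H, 1)), np.zeros(1))
query = rng.normal(size=C)              # stand-in for an encoder feature
recon, weights = memory_read(query, memory, params)
```

Because the output is built only from stored items, any component of the query that no memory item can express (such as an adversarial perturbation) has no direct path into `recon`, which is the intuition behind purification by reconstruction.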

Paper Structure (21 sections, 21 equations, 10 figures, 4 tables)

Figures (10)

  • Figure 1: Adversarial attack results. The perturbation is generated by the FGSM attack goodfellow2014explaining against a ViT dosovitskiy2010image classifier. Class A is the correct class and class B is an incorrect class. The original image is assigned to class A with a confidence of 98.9%. After the perturbation is added, the resulting adversarial image is assigned to class A with a confidence of only 11.2% and is misclassified as class B with a confidence of 80.8%. After the adversarial image is passed through our model for purification, the confidence that the purified image belongs to class A rises to 98.5%.
  • Figure 2: The architecture of the proposed MsMemoryGAN
  • Figure 3: The architecture of top encoder (a) and bottom encoder (b). The former consists of seven convolutional layers while the latter includes six convolutional layers. Some layers in both encoders are stacked with residual connections. The top encoder transforms and downsamples the image by a factor of 4, while the bottom encoder transforms and downsamples the image by a factor of 2.
  • Figure 4: The architecture of top decoder (a) and bottom decoder (b). The former consists of seven convolutional layers, while the latter includes six convolutional layers. Note that some convolutional layers are stacked with residual connections. The top decoder transforms and upsamples the input vector by a factor of 4, while the bottom decoder transforms and upsamples the input by a factor of 2.
  • Figure 5: The memory module architecture. Given a feature vector, a convolutional layer first reduces its dimensionality. An MLP network then finds the most relevant codes in memory, from which a memory-based representation of the input is obtained. Finally, this representation is fed to a convolutional layer for reconstruction.
  • ...and 5 more figures
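
For readers unfamiliar with the FGSM attack mentioned in the Figure 1 caption, the following minimal sketch shows the one-step update `x_adv = x + eps * sign(grad_x loss)`. It uses a random linear softmax classifier as a stand-in for the ViT model, and the values of `eps`, `W`, and `x` are illustrative assumptions, not settings from the paper.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def fgsm(x, label, W, b, eps):
    """One-step FGSM against a linear softmax classifier (a stand-in model)."""
    probs = softmax(W @ x + b)
    onehot = np.zeros_like(probs)
    onehot[label] = 1.0
    grad_x = W.T @ (probs - onehot)          # gradient of cross-entropy w.r.t. x
    return np.clip(x + eps * np.sign(grad_x), 0.0, 1.0)

rng = np.random.default_rng(0)
W = rng.normal(size=(2, 64))                 # two classes, 64 "pixels"
b = np.zeros(2)
x = rng.uniform(0.2, 0.8, size=64)           # toy image with values in [0, 1]
label = int(np.argmax(W @ x + b))            # treat the clean prediction as truth
x_adv = fgsm(x, label, W, b, eps=0.05)

p_clean = softmax(W @ x + b)[label]          # confidence before the attack
p_adv = softmax(W @ x_adv + b)[label]        # confidence after the attack
```

Each pixel moves by at most `eps`, yet the confidence in the original class drops, which mirrors the behaviour the caption reports for the real classifier.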