Cross-Modality Perturbation Synergy Attack for Person Re-identification

Yunpeng Gong; Zhun Zhong; Yansong Qu; Zhiming Luo; Rongrong Ji; Min Jiang

TL;DR

This work introduces the Cross-Modality Perturbation Synergy (CMPS) attack to expose security vulnerabilities in cross-modality ReID systems, uniting gradients from RGB, grayscale, and infrared modalities to craft a universal perturbation $\eta$. A cross-modality attack augmentation using grayscale transforms bridges modality gaps, while a cross-modality triplet loss coordinates cross-domain feature relationships. The authors demonstrate dramatic reductions in Rank-1 accuracy across SYSU, RegDB, and LLCM benchmarks and show strong transferability across models, outperforming state-of-the-art cross-modality attacks. The work also provides theoretical support for aggregated optimization over separate modality updates, suggesting lower generalization error and superior perturbation universality. This study highlights the need for robust defenses in multi-sensor ReID systems and points to future defense-focused research to counter cross-modality adversarial threats.

Abstract

In recent years, there has been significant research focusing on addressing security concerns in single-modal person re-identification (ReID) systems that are based on RGB images. However, the safety of cross-modality scenarios, which are more commonly encountered in practical applications involving images captured by infrared cameras, has not received adequate attention. The main challenge in cross-modality ReID lies in effectively dealing with visual differences between different modalities. For instance, infrared images are typically grayscale, unlike visible images that contain color information. Existing attack methods have primarily focused on the characteristics of the visible image modality, overlooking the features of other modalities and the variations in data distribution among different modalities. This oversight can potentially undermine the effectiveness of these methods in image retrieval across diverse modalities. This study represents the first exploration into the security of cross-modality ReID models and proposes a universal perturbation attack specifically designed for cross-modality ReID. This attack optimizes perturbations by leveraging gradients from diverse modality data, thereby disrupting the discriminator and reinforcing the differences between modalities. We conducted experiments on three widely used cross-modality datasets, namely RegDB, SYSU, and LLCM. The results not only demonstrate the effectiveness of our method but also provide insights for future improvements in the robustness of cross-modality ReID systems. The code will be available at https://github.com/finger-monkey/cmps__attack.
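
The alternating use of gradients from different modalities described above can be illustrated with a toy sketch. It is not the authors' implementation: the linear embedding `W`, the squared-distance loss (a stand-in for the paper's cross-modality triplet loss), and all function names and hyperparameters are assumptions made for illustration.

```python
def matvec(W, x):
    """Toy linear 'feature extractor': f(x) = W x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def batch_loss(W, batch, eta):
    """Sum of squared distances between perturbed features and their anchors."""
    total = 0.0
    for x, anchor in batch:
        feat = matvec(W, [xi + ei for xi, ei in zip(x, eta)])
        total += sum((f - a) ** 2 for f, a in zip(feat, anchor))
    return total

def batch_grad(W, batch, eta):
    """Gradient of batch_loss w.r.t. eta, i.e. the sum of 2 * W^T (W(x + eta) - anchor)."""
    grad = [0.0] * len(eta)
    for x, anchor in batch:
        feat = matvec(W, [xi + ei for xi, ei in zip(x, eta)])
        diff = [f - a for f, a in zip(feat, anchor)]
        for c in range(len(eta)):
            grad[c] += 2.0 * sum(W[r][c] * diff[r] for r in range(len(W)))
    return grad

def sign(v):
    return (v > 0) - (v < 0)

def synergy_step(W, batch, eta, alpha, eps):
    """One sign-gradient ascent step on eta using a single modality's batch,
    followed by clipping to the L-infinity ball of radius eps."""
    grad = batch_grad(W, batch, eta)
    return [max(-eps, min(eps, e + alpha * sign(g))) for e, g in zip(eta, grad)]

def learn_universal_perturbation(W, visible, infrared, eps=0.1, alpha=0.02, rounds=10):
    """Alternate between modalities so one shared eta accumulates both gradients."""
    eta = [0.0] * len(W[0])
    for _ in range(rounds):
        eta = synergy_step(W, visible, eta, alpha, eps)   # update from visible data
        eta = synergy_step(W, infrared, eta, alpha, eps)  # then refine on infrared data
    return eta

# Each sample is (input vector, feature of the same identity in the other modality).
W = [[1.0, 0.0, 0.5], [0.0, 1.0, -0.5]]
visible = [([0.2, 0.4, 0.6], [0.1, 0.3]), ([0.9, 0.1, 0.3], [0.8, 0.2])]
infrared = [([0.5, 0.5, 0.5], [0.4, 0.1]), ([0.3, 0.3, 0.3], [0.6, 0.0])]
eta = learn_universal_perturbation(W, visible, infrared)
```

A single perturbation vector `eta` is shared by every sample and both modalities, which is what makes it universal in this toy setting. A real attack would backpropagate through a trained ReID network and would also keep perturbed pixels in the valid image range.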

Paper Structure (22 sections, 34 equations, 7 figures, 6 tables, 1 algorithm)

Introduction
Related Works
Methodology
Overall Framework
Optimizing Loss Functions for Attacking
Cross-Modality Attack Augmentation Method
Cross-Modality Perturbation Synergy Attack
Experiments
Performance on Cross-Modality ReID
Transferability of CMPS
Ablation Study
Conclusion
Supplemental Experiments
Proof of Method Superiority
Definition of Cross-Modality Triplet Loss
...and 7 more sections

Figures (7)

Figure 1: Comparison between traditional and proposed methods. (a) Traditional attack methods (e.g., FGSM, PGD) are designed primarily for single-modal tasks and lack mechanisms to associate multiple modalities, so they cannot simultaneously mislead retrieval results across different modalities. (b) The proposed method employs an intrinsic mechanism that associates different modalities, misleading retrieval results across multiple modalities simultaneously.
Figure 2: Illustration of the CMPS attack framework. We generate homogeneous grayscale images through random grayscale transformations to reduce the differences between modalities, aiding in the learning of a universal perturbation. The process is as follows: first, the gradient from one modality is used to optimize the universal perturbation, which is then applied to another modality's images to generate adversarial samples for attacks. The new modality's gradient is then used to further optimize the perturbation and attack the next modality. By aggregating feature gradients from different modalities, we iteratively learn a universal perturbation, pushing samples toward a common region in the manifold. The manifold is represented as a sphere, with identical shapes but different colors representing the same person's features across modalities. This method captures shared knowledge between modalities, enabling more effective learning of cross-modal universal perturbations.
Figure 3: Schematic illustration of triplet relationship-guided universal perturbation learning for cross-modality ReID.
Figure 4: Cross-modality attack augmentation: bridging the gap between visible and non-visible (infrared) modalities with grayscale.
Figure 5: The impact of different grayscale transformation probabilities on attack performance. Lower evaluation metrics indicate higher attack success rates. Results are from the RegDB dataset with AGW as the baseline model.
...and 2 more figures
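
The random grayscale transformation mentioned in the captions of Figures 2, 4, and 5 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the nested-list image layout, and the choice of ITU-R BT.601 luma weights are all hypothetical, and the probability `p` plays the role of the transformation probability varied in Figure 5.

```python
import random

# ITU-R BT.601 luma weights, a common RGB-to-grayscale convention.
# Using them here is an assumption; the source does not state the formula.
LUMA = (0.299, 0.587, 0.114)

def to_grayscale(image):
    """Map every (r, g, b) pixel to (y, y, y), keeping the 3-channel layout
    so the result can be fed to the same model as a color image."""
    out = []
    for row in image:
        new_row = []
        for r, g, b in row:
            y = LUMA[0] * r + LUMA[1] * g + LUMA[2] * b
            new_row.append((y, y, y))
        out.append(new_row)
    return out

def random_grayscale(image, p, rng=random):
    """With probability p return the grayscale version, else the image unchanged."""
    if rng.random() < p:
        return to_grayscale(image)
    return image

# A 1x2 toy image with channel values in [0, 1].
img = [[(1.0, 0.0, 0.0), (0.2, 0.4, 0.6)]]
always_gray = random_grayscale(img, 1.0)
never_gray = random_grayscale(img, 0.0)
```

Keeping three identical channels, rather than collapsing to one, means a grayscale sample and a color sample share the same input shape, so one perturbation can be added to both.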
