
Parameter Hierarchical Optimization for Visible-Infrared Person Re-Identification

Zeng YU, Yunxiao Shi

TL;DR

This work tackles visible-infrared person re-identification (VI-ReID) by reframing parameter optimization as a hierarchical problem. It introduces Parameter Hierarchical Optimization (PHO), which partitions network parameters into a directly optimized group and a trained group, so that alignment-focused components can be optimized without training the full network. Its three components, the Self-Adaptive Alignment Strategy (SAS), Auto-weighted Alignment Learning (AAL), and Cross-modality Consistent Learning (CCL), provide cross-modality translation, dimension-aware feature weighting, and translation-consistent representations, respectively, with theoretical underpinnings in closed-form optimization. Experiments on SYSU-MM01, RegDB, and HITSZ-VCM show competitive or state-of-the-art performance, and ablations confirm that SAS, AAL, and CCL make complementary contributions. Overall, PHO offers a parameter-efficient and effective paradigm for VI-ReID.

Abstract

Visible-infrared person re-identification (VI-ReID) aims at matching cross-modality pedestrian images captured by disjoint visible and infrared cameras. Existing methods alleviate the cross-modality discrepancy by designing different kinds of network architectures. In contrast, we propose a novel parameter optimization paradigm, the parameter hierarchical optimization (PHO) method, for the task of VI-ReID. It allows part of the parameters to be optimized directly without any training, which narrows the parameter search space and makes the whole network easier to train. Specifically, we first divide the parameters into different types, and then introduce a self-adaptive alignment strategy (SAS) to automatically align the visible and infrared images through transformation. Considering that features in different dimensions have varying importance, we develop an auto-weighted alignment learning (AAL) module that automatically weights features according to their importance. Importantly, in the alignment process of SAS and AAL, all the parameters are optimized immediately with optimization principles rather than by training the whole network, which yields a better way of obtaining the parameters. Furthermore, we establish the cross-modality consistent learning (CCL) loss to extract discriminative person representations with translation consistency. We provide both theoretical justification and empirical evidence that our proposed PHO method outperforms existing VI-ReID approaches.
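To make the idea of optimizing some parameters "directly without any training" concrete, here is a minimal sketch. It is not the paper's SAS or AAL formulation: it assumes a plain ridge-regularized least-squares objective, $\min_A \|AX - Y\|_F^2 + \lambda\|A\|_F^2$, whose minimizer has a closed form. The function name `closed_form_alignment`, the `ridge` parameter, and the synthetic features are all hypothetical and chosen for illustration only.

```python
import numpy as np


def closed_form_alignment(X, Y, ridge=1e-6):
    """Return the matrix A minimizing ||A X - Y||_F^2 + ridge * ||A||_F^2.

    X, Y: arrays of shape (n, m), with n feature dimensions and m samples.
    Illustrative assumption: a generic least-squares alignment,
    not the objective P_1 defined in the paper.
    """
    n = X.shape[0]
    gram = X @ X.T + ridge * np.eye(n)  # symmetric positive definite
    # The minimizer satisfies A @ gram = Y @ X.T. Because gram is symmetric,
    # transposing gives gram @ A.T = X @ Y.T, which is solved for A.T.
    return np.linalg.solve(gram, X @ Y.T).T


# Synthetic stand-ins for visible (X) and infrared (Y) features.
rng = np.random.default_rng(0)
n, m = 4, 200
X = rng.normal(size=(n, m))
A_true = rng.normal(size=(n, n))
Y = A_true @ X

# One linear solve yields the alignment matrix; no gradient steps are needed.
A_star = closed_form_alignment(X, Y)
residual = np.linalg.norm(A_star @ X - Y)
```

In a hierarchical scheme of this kind, a matrix such as `A_star` would be recomputed from the current features whenever needed, while only the remaining parameters (for example, the backbone weights) are updated by a gradient-based optimizer.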

Paper Structure (25 sections, 3 theorems, 33 equations, 2 figures, 4 tables, 1 algorithm)


Key Result

Theorem 2

Problem $P_{1}(A)$ is minimized iff [...], where $A^\star = (A_{ij}^\star)_{n\times n}$ is the optimal transformation matrix with respect to problem $P_{1}$, $\bar{X}_{j}$ is the $j$-th dimension of $\bar{X}$, and $\bar{Y}_{i}$ is the $i$-th dimension of $\bar{Y}$.

Figures (2)

  • Figure 1: Framework of our PHO method. All parameters are first divided into directly optimized and non-directly optimized parameters. The directly optimized parameters are then optimized immediately with the self-adaptive alignment strategy and auto-weighted alignment learning, while the remaining non-directly optimized parameters are trained with optimizers. Because some parameters are optimized directly rather than trained, the parameter search space shrinks and training becomes easier.
  • Figure 2: t-SNE visualization of the learned features.

Theorems & Definitions (7)

  • Definition 1
  • Remark 1
  • Theorem 2
  • Remark 2
  • Theorem 3
  • Theorem 4
  • Remark 3