Parameter Hierarchical Optimization for Visible-Infrared Person Re-Identification
Zeng YU, Yunxiao Shi
TL;DR
This work tackles visible-infrared person re-identification (VI-reID) by reframing parameter optimization as a hierarchical problem. It introduces Parameter Hierarchical Optimization (PHO), which partitions network parameters into a directly optimizable group and a trainable group, so that alignment-focused components can be optimized without training the full network. The three core components are the Self-Adaptive Alignment Strategy (SAS), which performs cross-modality translation; Auto-weighted Alignment Learning (AAL), which weights feature dimensions by importance; and Cross-modality Consistent Learning (CCL), which encourages translation-consistent representations. The directly optimized parameters are grounded in closed-form optimization. Empirical results on SYSU-MM01, RegDB, and HITSZ-VCM show competitive or state-of-the-art performance, and ablations confirm that SAS, AAL, and CCL make complementary contributions. Overall, PHO offers a parameter-efficient and effective paradigm for VI-reID.
Abstract
Visible-infrared person re-identification (VI-reID) aims to match cross-modality pedestrian images captured by disjoint visible and infrared cameras. Existing methods alleviate the cross-modality discrepancy by designing various network architectures. Unlike these methods, in this paper we propose a novel parameter optimization paradigm, the parameter hierarchical optimization (PHO) method, for the task of VI-reID. It allows a subset of the parameters to be optimized directly, without any training, which narrows the parameter search space and makes the whole network easier to train. Specifically, we first divide the parameters into different types, and then introduce a self-adaptive alignment strategy (SAS) to automatically align the visible and infrared images through transformation. Considering that features in different dimensions have varying importance, we develop an auto-weighted alignment learning (AAL) module that automatically weights features according to their importance. Importantly, in the alignment process of SAS and AAL, all the parameters are optimized immediately using optimization principles rather than by training the whole network, which yields a better parameter training scheme. Furthermore, we establish the cross-modality consistent learning (CCL) loss to extract discriminative person representations with translation consistency. We provide both theoretical justification and empirical evidence that our proposed PHO method outperforms existing VI-reID approaches.
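To make the idea of "directly optimized without any training" concrete, the following is a minimal sketch of a closed-form alignment between two sets of features. It is not the SAS or AAL formulation of this paper: the ridge-regression objective, the function name `closed_form_alignment`, the regularizer `lam`, and the synthetic features are all assumptions chosen for illustration. It only shows how an alignment matrix can be obtained from a linear solve instead of gradient-based training.

```python
import numpy as np


def closed_form_alignment(x_ir, x_vis, lam=1e-3):
    """Hypothetical helper (not from the paper).

    Solves min_W ||x_ir @ W - x_vis||_F^2 + lam * ||W||_F^2 in closed form,
    i.e. W = (x_ir^T x_ir + lam * I)^{-1} x_ir^T x_vis.
    """
    d = x_ir.shape[1]
    gram = x_ir.T @ x_ir + lam * np.eye(d)
    # Solve the normal equations rather than forming an explicit inverse.
    return np.linalg.solve(gram, x_ir.T @ x_vis)


# Synthetic stand-ins for infrared and visible features (assumed data).
rng = np.random.default_rng(0)
x_ir = rng.normal(size=(200, 8))
true_map = rng.normal(size=(8, 8))
x_vis = x_ir @ true_map + 0.01 * rng.normal(size=(200, 8))

# The alignment parameters come from one linear solve, with no training loop.
W = closed_form_alignment(x_ir, x_vis)

gap_before = np.linalg.norm(x_ir - x_vis)
gap_after = np.linalg.norm(x_ir @ W - x_vis)
print(f"gap before alignment: {gap_before:.3f}")
print(f"gap after alignment:  {gap_after:.3f}")
```

In this toy setting the alignment matrix is recovered in a single step, while any remaining network parameters would still be learned by ordinary training. This mirrors the split between directly optimizable and trainable parameters only in spirit.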
