CKDA: Cross-modality Knowledge Disentanglement and Alignment for Visible-Infrared Lifelong Person Re-identification
Zhenyu Cui, Jiahuan Zhou, Yuxin Peng
TL;DR
The paper tackles Visible-Infrared Lifelong Person Re-Identification (VI-LReID), where a model must learn sequentially from visible and infrared data without forgetting previously learned cross-modal knowledge. It introduces CKDA, a Cross-modality Knowledge Disentanglement and Alignment framework that explicitly separates modality-common and modality-specific knowledge using Modality-Common Prompting (MCP) and Modality-Specific Prompting (MSP), then applies Cross-modal Knowledge Alignment (CKA) in dual prototype-based feature spaces. The training objective combines a base loss with a prompting consistency term and inter-/intra-modality alignment losses: $\mathcal{L} = \mathcal{L}_{base} + \alpha \mathcal{L}_{p} + \beta(\mu \mathcal{L}_{inter} + (1-\mu) \mathcal{L}_{intra})$. Experiments on four VI-LReID benchmarks show CKDA achieving state-of-the-art performance with notable anti-forgetting effects, illustrating the practical value of explicit knowledge disentanglement and balanced cross-modal alignment for day-night pedestrian re-identification.
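As a reading aid, the weighted objective in the TL;DR can be sketched in a few lines of Python. This is only an illustration of how the terms combine, not the authors' implementation: the function name `ckda_total_loss`, the default weights, and the check that $\mu$ lies in $[0, 1]$ are assumptions made here, and the individual loss values are taken as already-computed numbers.

```python
def ckda_total_loss(l_base, l_p, l_inter, l_intra, alpha=1.0, beta=1.0, mu=0.5):
    """Combine the loss terms as L = L_base + alpha*L_p + beta*(mu*L_inter + (1-mu)*L_intra).

    All names and default weights are hypothetical; the source only gives the formula.
    """
    # Assumption: mu is a balance factor between the two alignment terms,
    # so it is restricted to [0, 1] here.
    if not 0.0 <= mu <= 1.0:
        raise ValueError("mu must lie in [0, 1]")

    # Balanced mix of inter-modality and intra-modality alignment losses.
    alignment = mu * l_inter + (1.0 - mu) * l_intra

    # Base loss + prompting consistency term + weighted alignment term.
    return l_base + alpha * l_p + beta * alignment
```

With `mu=1.0` only the inter-modality term contributes to alignment, and with `mu=0.0` only the intra-modality term does, which is one way to read the "balanced" alignment described in the summary.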
Abstract
Lifelong person Re-IDentification (LReID) aims to match the same person using individual data collected continuously from different scenarios. To achieve continuous all-day person matching across day and night, Visible-Infrared Lifelong person Re-IDentification (VI-LReID) trains sequentially on data from the visible and infrared modalities and aims to maximize average performance across all of the data. To this end, existing methods typically exploit cross-modal knowledge distillation to alleviate the catastrophic forgetting of old knowledge. However, these methods ignore the mutual interference between acquiring modality-specific knowledge and preventing the forgetting of modality-common knowledge, where conflicting knowledge leads to collaborative forgetting. To address this problem, this paper proposes a Cross-modality Knowledge Disentanglement and Alignment method, called CKDA, which explicitly separates and preserves modality-specific knowledge and modality-common knowledge in a balanced way. Specifically, a Modality-Common Prompting (MCP) module and a Modality-Specific Prompting (MSP) module are proposed to explicitly disentangle and purify the discriminative information shared across modalities and the information specific to each modality, avoiding mutual interference between the two kinds of knowledge. In addition, a Cross-modal Knowledge Alignment (CKA) module is designed to further align the disentangled new knowledge with the old knowledge, in a balanced manner, in two mutually independent inter- and intra-modality feature spaces built on dual-modality prototypes. Extensive experiments on four benchmark datasets verify the effectiveness and superiority of CKDA over state-of-the-art methods. The source code of this paper is available at https://github.com/PKU-ICST-MIPL/CKDA-AAAI2026.
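To make the notion of dual-modality prototypes more concrete, the sketch below computes one prototype per (modality, identity) pair as the mean of that pair's feature vectors. Treating a prototype as a class mean is a common convention and an assumption here; the abstract does not state how CKDA builds its prototypes. The function name `modality_prototypes` and the modality labels are hypothetical.

```python
def modality_prototypes(features, identities, modalities):
    """Return {(modality, identity): mean feature vector}.

    Assumption: a prototype is the per-identity mean feature within one
    modality. This is illustrative only, not the authors' definition.
    """
    sums = {}
    counts = {}
    for feat, pid, mod in zip(features, identities, modalities):
        key = (mod, pid)
        if key not in sums:
            sums[key] = [0.0] * len(feat)
            counts[key] = 0
        # Accumulate the feature vector element-wise for this (modality, identity).
        sums[key] = [s + x for s, x in zip(sums[key], feat)]
        counts[key] += 1

    # Divide each accumulated vector by the number of samples it covers.
    return {key: [s / counts[key] for s in total] for key, total in sums.items()}


# Toy usage: identity 7 seen twice in visible light and once in infrared.
protos = modality_prototypes(
    features=[[1.0, 3.0], [3.0, 5.0], [2.0, 2.0]],
    identities=[7, 7, 7],
    modalities=["visible", "visible", "infrared"],
)
```

Keeping separate prototypes per modality gives two sets of reference points for each identity, which is the kind of structure an inter-modality comparison (visible versus infrared) and an intra-modality comparison (within one modality) could each draw on.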
