Rethinking Clothes Changing Person ReID: Conflicts, Synthesis, and Optimization
Junjie Li, Guanshuo Wang, Fufu Yu, Yichao Yan, Qiong Jia, Shouhong Ding, Xingdong Sheng, Yunhui Liu, Xiaokang Yang
TL;DR
This work tackles clothes-changing person re-identification (CC-ReID) by exposing an intrinsic conflict between the standard (same-clothes) and clothes-changing learning objectives, and proposes a two-pronged solution: high-fidelity clothes-varying data synthesis via Clothes-Changing Diffusion (CC-Diffusion), and a multi-objective optimization (MOO) framework that balances the competing objectives. CC-Diffusion generates controllable, identity-consistent clothes-varying images to augment the CC training data. CC-ReID learning is then reformulated around three objectives, $\mathcal{L}_{id}$, $\mathcal{L}_{sc}$, and $\mathcal{L}_{cc}$, which are optimized with gradient-based Pareto methods and guided by human preference vectors to reach a desired trade-off. The key contributions are exposing the objective conflict in CC-ReID, introducing high-quality synthetic data, and delivering a model-agnostic MOO solution that yields Pareto-optimal, practically balanced performance under both the CC and standard ReID protocols. On the PRCC and CCVID datasets, the approach significantly improves CC performance while keeping SC (same-clothes) performance under control, which highlights the practical value of combining data synthesis with principled multi-objective optimization.
Abstract
Clothes-changing person re-identification (CC-ReID) aims to retrieve images of the same person wearing different outfits. Mainstream research focuses on designing advanced model structures and strategies that capture identity information independent of clothing. However, same-clothes discrimination, the standard ReID learning objective, has been persistently ignored in previous CC-ReID research. In this study, we examine the relationship between the standard and clothes-changing (CC) learning objectives and bring the inner conflict between them to the fore. We increase the proportion of CC training pairs by supplementing the training data with high-fidelity clothes-varying synthetic images produced by our proposed Clothes-Changing Diffusion model. Incorporating these synthetic images into CC-ReID model training yields a significant improvement under the CC protocol. However, this improvement comes at the cost of performance under the standard protocol, owing to the inner conflict between the standard and CC objectives. To mitigate the conflict, we decouple these objectives and re-formulate CC-ReID learning as a multi-objective optimization (MOO) problem. By effectively regularizing the gradient curvature across multiple objectives and introducing preference restrictions, our MOO solution surpasses the single-task training paradigm. Our framework is model-agnostic and demonstrates superior performance under both the CC and standard ReID protocols.
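To make the multi-objective view concrete, the sketch below shows one common gradient-based way to handle conflicting objectives: finding the minimum-norm convex combination of per-objective gradients, in the style of MGDA. This is an illustrative assumption, not the solver used in this work; the curvature regularization and preference restrictions mentioned above are not modeled. The function names and the toy gradients (standing in for the gradients of $\mathcal{L}_{id}$, $\mathcal{L}_{sc}$, and $\mathcal{L}_{cc}$) are hypothetical.

```python
import numpy as np


def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]                      # sort descending
    css = np.cumsum(u) - 1.0
    idx = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)


def min_norm_weights(grads, iters=500):
    """Projected gradient descent on f(w) = ||sum_k w_k g_k||^2 over the simplex."""
    G = np.asarray(grads, dtype=float)        # shape (K, D): one gradient per objective
    gram = G @ G.T                            # pairwise gradient inner products
    step = 1.0 / (2.0 * np.linalg.norm(gram, 2) + 1e-12)  # 1 / Lipschitz constant of grad f
    w = np.full(G.shape[0], 1.0 / G.shape[0]) # start from uniform weights
    for _ in range(iters):
        w = project_to_simplex(w - step * 2.0 * (gram @ w))
    return w


def common_descent_direction(grads):
    """Return simplex weights w and the combined direction d = sum_k w_k g_k."""
    G = np.asarray(grads, dtype=float)
    w = min_norm_weights(G)
    return w, w @ G


# Toy 2-D gradients (hypothetical values) in which two objectives partly conflict.
toy_grads = np.array([
    [1.0, 0.0],    # stand-in for the gradient of L_id
    [1.0, 1.0],    # stand-in for the gradient of L_sc
    [-0.5, 1.0],   # stand-in for the gradient of L_cc
])
weights, direction = common_descent_direction(toy_grads)
print("weights:", weights)
print("direction:", direction)
print("per-objective inner products:", toy_grads @ direction)
```

Stepping the parameters along the negative of `direction` does not increase any of the three objectives to first order, because the minimum-norm point has a non-negative inner product with every individual gradient. A fixed weighted sum of the losses gives no such guarantee when the gradients conflict.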
