Group-CLIP Uncertainty Modeling for Group Re-Identification
Qingxin Zhang, Haoyan Wei, Yang Qian
TL;DR
This paper tackles Group ReID under uncertain group configurations by applying CLIP with uncertainty modeling. It introduces three components—Member Variant Simulation (MVS), Group Layout Adaptation (GLA), and Group Relationship Construction Encoder (GRCE)—and employs a two-stage training strategy to align visual and text representations across varying group sizes and layouts. The approach yields state-of-the-art results on iLIDS-MCTS, RoadGroup, and CSG datasets, demonstrating that textual uncertainty descriptions can generalize group structures beyond fixed configurations. The work broadens CLIP's applicability to group-level tasks and offers a practical path for robust multi-camera group re-identification in real-world scenarios.
Abstract
Group Re-Identification (Group ReID) aims to match groups of pedestrians across non-overlapping cameras. Unlike single-person ReID, Group ReID focuses more on changes in group structure, emphasizing the number of members and their spatial arrangement. However, most methods rely on certainty-based models, which consider only the specific group structures present in the group images and therefore often fail to match unseen group configurations. To this end, we propose a novel Group-CLIP Uncertainty Modeling (GCUM) approach that adapts group text descriptions to accommodate undetermined member and layout variations. Specifically, we design a Member Variant Simulation (MVS) module that simulates member exclusions using a Bernoulli distribution and a Group Layout Adaptation (GLA) module that generates uncertain group text descriptions with identity-specific tokens. In addition, we design a Group Relationship Construction Encoder (GRCE) that uses group features to refine individual features, and employ a cross-modal contrastive loss to obtain generalizable knowledge from group text descriptions. Notably, we are the first to apply CLIP to Group ReID, and extensive experiments show that GCUM significantly outperforms state-of-the-art Group ReID methods.
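The member-exclusion idea behind MVS can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `keep_prob` parameter, the fallback for empty groups, and the prompt template with `[ID_k]` placeholders are all assumptions made for illustration. In the actual method the identity-specific tokens would be learnable embeddings rather than literal strings.

```python
import random


def simulate_member_variants(members, keep_prob=0.8, rng=None):
    """Keep each member independently with probability keep_prob (one Bernoulli trial per member)."""
    rng = rng or random.Random()
    kept = [m for m in members if rng.random() < keep_prob]
    # Assumption: a group must keep at least one member, so fall back to a random one.
    if not kept:
        kept = [rng.choice(members)]
    return kept


def build_group_prompt(kept_members):
    """Assemble a text description from a hypothetical template.

    The [ID_k] strings are placeholders standing in for identity-specific tokens.
    """
    tokens = " ".join(f"[ID_{m}]" for m in kept_members)
    return f"A photo of a group with {len(kept_members)} members: {tokens}."


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is reproducible
    group = [0, 1, 2, 3]    # hypothetical member indices within one group
    kept = simulate_member_variants(group, keep_prob=0.8, rng=rng)
    print(build_group_prompt(kept))
```

Sampling a fresh subset on every training step would expose the text side to many group sizes for the same group identity, which is the intuition behind describing uncertain rather than fixed configurations.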
