Cerberus: Attribute-based person re-identification using semantic IDs
Chanho Eom, Geon Lee, Kyunghwan Cho, Hyeonseok Jung, Moonsub Jin, Bumsub Ham
TL;DR
Cerberus tackles attribute-based person reID by learning multiple partial representations aligned to semantic IDs (SIDs) derived from grouped attributes. It introduces a semantic guidance loss $\mathcal{L}_{sem}$ to pull same-SID representations toward SID prototypes and a regularization term $\mathcal{L}_{reg}$ to infer prototypes for unseen SIDs, enabling robust zero-shot generalization. The model achieves state-of-the-art results on Market-1501 and DukeMTMC-reID for attribute-based reID and delivers competitive PAR and APS performance using a single unified framework. This approach yields a practical, interpretable visual-semantic embedding that supports reID, PAR, and APS without task-specific fine-tuning, offering scalable deployment for attribute-driven surveillance tasks.
Abstract
We introduce a new framework, dubbed Cerberus, for attribute-based person re-identification (reID). Our approach leverages person attribute labels to learn local and global person representations that encode specific traits, such as gender and clothing style. To achieve this, we define semantic IDs (SIDs) by combining attribute labels, and use a semantic guidance loss to align the person representations with the prototypical features of the corresponding SIDs, encouraging the representations to encode the relevant semantics. Simultaneously, we enforce the representations of the same person to be embedded closely, so that the model can recognize subtle differences in appearance and discriminate between persons sharing the same attribute labels. To increase the generalization ability on unseen data, we also propose a regularization method that takes advantage of the relationships between SID prototypes. Our framework performs individual comparisons of local and global person representations between query and gallery images for attribute-based reID. By exploiting the SID prototypes aligned with the corresponding representations, it can also perform person attribute recognition (PAR) and attribute-based person search (APS) without bells and whistles. Experimental results on standard benchmarks for attribute-based person reID, Market-1501 and DukeMTMC, demonstrate the superiority of our model compared to the state of the art.
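To make the idea of SIDs and prototype alignment concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the abstract does not give the form of the semantic guidance loss, so the softmax cross-entropy over cosine similarities, the `temperature` value, and all function names (`build_sid_table`, `semantic_guidance_loss`, `predict_sid`) are assumptions chosen for illustration.

```python
import math


def build_sid_table(attribute_rows, group):
    """Assign an integer SID to each distinct combination of the attributes in `group`.

    Hypothetical helper: one SID per unique tuple of attribute values.
    """
    table = {}
    for row in attribute_rows:
        key = tuple(row[name] for name in group)
        if key not in table:
            table[key] = len(table)
    return table


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def semantic_guidance_loss(feature, prototypes, sid, temperature=0.1):
    """Assumed form of the loss: cross-entropy over prototype similarities.

    The loss is small when `feature` is closest to the prototype of its own SID.
    """
    logits = [cosine(feature, p) / temperature for p in prototypes]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    top = max(logits)
    log_z = top + math.log(sum(math.exp(x - top) for x in logits))
    return log_z - logits[sid]


def predict_sid(feature, prototypes):
    """Return the index of the most similar prototype (a PAR-style prediction)."""
    sims = [cosine(feature, p) for p in prototypes]
    return max(range(len(sims)), key=sims.__getitem__)
```

Under these assumptions, a representation lying near the prototype of its own SID incurs a lower loss than one lying near another prototype, and `predict_sid` recovers attribute labels by mapping the predicted SID back through the table built by `build_sid_table`. The regularization over unseen SIDs is not sketched here because the abstract does not describe it in enough detail.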
