COMAE: COMprehensive Attribute Exploration for Zero-shot Hashing

Yuqi Li, Qingqing Long, Yihang Zhou, Ran Zhang, Zhiyuan Ning, Zhihong Zhu, Yuanchun Zhou, Xuezhi Wang, Meng Xiao

TL;DR

COMAE addresses limitations in zero-shot hashing by explicitly modeling locality and continuous attributes through three consistency explorations (point-wise, pair-wise, class-wise) and by employing an attribute prototype network with contrastive context. The framework jointly optimizes image representations, visual attributes, and hash codes, backed by a theoretical lower-bound analysis linking inter- and intra-class distances to hashing performance. Empirical results on AWA2, CUB, and SUN demonstrate state-of-the-art mAP and AUC across multiple code lengths, with pronounced gains as the number of unseen classes grows. The approach also shows superior efficiency and robust behavior under varying unseen-class ratios, highlighting its practicality for large-scale retrieval in open-set settings.
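
For readers unfamiliar with the mAP metric mentioned above, the following is a minimal sketch of how mean average precision is commonly computed for Hamming-ranking retrieval. It is not the paper's evaluation code: the function names, the toy codes, and the index-order tie-breaking are illustrative assumptions.

```python
def hamming(a, b):
    """Number of positions at which two equal-length bit sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def average_precision(query_code, query_label, db_codes, db_labels):
    """AP of one query when the database is ranked by Hamming distance.

    Ties are broken by database order (Python's sort is stable); this is
    an assumption, since evaluation protocols differ on tie handling.
    """
    order = sorted(range(len(db_codes)),
                   key=lambda i: hamming(query_code, db_codes[i]))
    hits, precision_sum = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if db_labels[i] == query_label:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(q_codes, q_labels, db_codes, db_labels):
    """Mean of the per-query average precisions."""
    aps = [average_precision(c, l, db_codes, db_labels)
           for c, l in zip(q_codes, q_labels)]
    return sum(aps) / len(aps)


# Toy 4-bit database with two hypothetical classes "a" and "b".
db_codes = [(0, 0, 0, 0), (0, 0, 0, 1), (1, 1, 1, 1), (1, 1, 1, 0)]
db_labels = ["a", "a", "b", "b"]
q_codes = [(0, 0, 1, 0), (1, 1, 0, 1)]
q_labels = ["a", "b"]
score = mean_average_precision(q_codes, q_labels, db_codes, db_labels)
```

In this toy case the first query retrieves both "a" items first (AP = 1), while the second query has one "a" item tied ahead of a "b" item (AP = 5/6), so the mean is 11/12.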

Abstract

Zero-shot hashing (ZSH) has proven effective in large-scale retrieval owing to its efficiency and generalization. Despite this progress, important limitations remain. Existing works ignore the locality relationships of representations and attributes, which transfer effectively between seen and unseen classes. Moreover, continuous-valued attributes are not fully exploited. In response, we conduct a COMprehensive Attribute Exploration for ZSH, named COMAE, which depicts the relationships from seen classes to unseen ones through three carefully designed explorations, i.e., point-wise, pair-wise, and class-wise consistency constraints. By regressing attributes from the proposed attribute prototype network, COMAE learns local features that are relevant to the visual attributes. COMAE then uses contrastive learning to capture the context of attributes, rather than optimizing each instance independently. Finally, the class-wise constraint is designed to jointly learn the hash code, image representation, and visual attributes. Experimental results on popular ZSH datasets demonstrate that COMAE outperforms state-of-the-art hashing techniques, especially in scenarios with a larger number of unseen classes.
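
To make the three consistency constraints more concrete, the sketch below combines one plausible form of each into a single objective. The abstract does not give the exact losses, so everything here is an assumption made for illustration: mean squared error for the point-wise attribute regression, an InfoNCE-style contrastive term for the pair-wise constraint, cross-entropy over attribute-compatibility scores for the class-wise constraint, and the hypothetical weights `w_pair` and `w_class`.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


def pointwise_loss(pred_attr, true_attr):
    """Assumed form: mean squared error between regressed and annotated attributes."""
    return sum((p - t) ** 2 for p, t in zip(pred_attr, true_attr)) / len(true_attr)


def pairwise_loss(features, labels, anchor, tau=0.5):
    """Assumed form: InfoNCE-style loss; same-label samples are positives."""
    others = [j for j in range(len(features)) if j != anchor]
    weights = {j: math.exp(cosine(features[anchor], features[j]) / tau)
               for j in others}
    pos = sum(w for j, w in weights.items() if labels[j] == labels[anchor])
    if pos == 0.0:  # anchor has no positive in this batch: contribute nothing
        return 0.0
    return -math.log(pos / sum(weights.values()))


def classwise_loss(pred_attr, class_attrs, target):
    """Assumed form: cross-entropy over dot-product scores against class attributes."""
    scores = [dot(pred_attr, a) for a in class_attrs]
    m = max(scores)  # subtract the max for a numerically stable log-sum-exp
    log_norm = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_norm - scores[target]


def total_loss(pred_attrs, features, labels, class_attrs, w_pair=1.0, w_class=1.0):
    """Batch average of the three terms; labels index into class_attrs."""
    n = len(labels)
    point = sum(pointwise_loss(pred_attrs[i], class_attrs[labels[i]])
                for i in range(n)) / n
    pair = sum(pairwise_loss(features, labels, i) for i in range(n)) / n
    cls = sum(classwise_loss(pred_attrs[i], class_attrs, labels[i])
              for i in range(n)) / n
    return point + w_pair * pair + w_class * cls


# Toy batch: two classes described by two continuous attributes each.
class_attrs = [[1.0, 0.0], [0.0, 1.0]]
labels = [0, 0, 1, 1]
features = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]]
pred_attrs = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
loss = total_loss(pred_attrs, features, labels, class_attrs)
```

The point of the sketch is only the division of labor: the first term looks at each sample alone, the second at a sample relative to the rest of the batch, and the third at a sample relative to every class description.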

Paper Structure

This paper contains 23 sections, 2 theorems, 8 equations, 5 figures, 3 tables, 1 algorithm.

Key Result

Theorem 1

[zhu2022lower] Let $\mathcal{D}_{\mathit{inter}}$ and $\mathcal{D}_{\mathit{intra}}$ denote the inter-class distinctiveness and intra-class compactness, respectively. The lower bound of deep supervised hashing performance is proportional to

Figures (5)

  • Figure 1: The architecture of the proposed COMAE. It consists of three modules: a) the point-wise objective improves image locality and attribute representations; b) the pair-wise loss moves representation learning from individual instances to context-based learning; and c) the class-wise constraint captures the relationships between attributes and class labels.
  • Figure 2: Comparison of PR, P@N, and R@N curves for 64-bit hash codes.
  • Figure 3: mAP and AUC scores as the ratio of unseen classes changes during training.
  • Figure 4: Histograms of intra-class and inter-class distances. The arrow annotation marks the quantitative separability in Hamming distance, $\mathbb{E}[D_{inter}]-\mathbb{E}[D_{intra}]$.
  • Figure 5: Case study on the CUB dataset.
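
The separability quantity in the Figure 4 caption, $\mathbb{E}[D_{inter}]-\mathbb{E}[D_{intra}]$, can be estimated from a set of hash codes as sketched below. This is a minimal illustration rather than the authors' code: the function names and the toy codes are assumptions, and the expectations are taken as plain averages over all unordered pairs.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length bit sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def separability(codes, labels):
    """Return (mean inter-class, mean intra-class, their difference).

    Assumes the data contains at least one same-label pair and at least
    one different-label pair; otherwise a mean would be undefined.
    """
    intra, inter = [], []
    for i, j in combinations(range(len(codes)), 2):
        d = hamming(codes[i], codes[j])
        (intra if labels[i] == labels[j] else inter).append(d)
    mean_intra = sum(intra) / len(intra)
    mean_inter = sum(inter) / len(inter)
    return mean_inter, mean_intra, mean_inter - mean_intra


# Toy 4-bit codes for two hypothetical classes.
codes = [(0, 0, 0, 0), (0, 0, 0, 1), (1, 1, 1, 1), (1, 1, 1, 0)]
labels = ["a", "a", "b", "b"]
result = separability(codes, labels)  # -> (3.5, 1.0, 2.5)
```

A larger gap means codes of different classes sit further apart in Hamming space than codes of the same class, which is the property the histograms visualize.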

Theorems & Definitions (2)

  • Theorem 1
  • Theorem 2