HiHPQ: Hierarchical Hyperbolic Product Quantization for Unsupervised Image Retrieval

Zexuan Qiu, Jiahong Liu, Yankai Chen, Irwin King

TL;DR

This work tackles unsupervised image retrieval by addressing two gaps: preserving multi-level semantic similarities and leveraging non-Euclidean geometry. It introduces HiHPQ, a Hierarchical Hyperbolic Product Quantization framework that embeds data in a Cartesian product of Lorentzian manifolds, uses a differentiable soft hyperbolic codebook, and optimizes with a hyperbolic contrastive objective. A hierarchical semantics learning module extracts pseudo hierarchies via clustering in tangent spaces and enforces both prototype-wise and instance-wise supervision on the hyperbolic embeddings. Empirical results on Flickr25K, NUS-WIDE, and CIFAR-10 demonstrate substantial gains over strong baselines, validating the benefits of combining hyperbolic geometry with hierarchical supervision for unsupervised product quantization.

Abstract

Existing unsupervised deep product quantization methods primarily aim to increase the similarity between different views of the same image, while overlooking the subtle multi-level semantic similarities between images. Moreover, these methods predominantly work in Euclidean space for computational convenience, which compromises their ability to map the multi-level semantic relationships between images effectively. To mitigate these shortcomings, we propose a novel unsupervised product quantization method dubbed Hierarchical Hyperbolic Product Quantization (HiHPQ), which learns quantized representations by incorporating hierarchical semantic similarity within hyperbolic geometry. Specifically, we propose a hyperbolic product quantizer, in which a hyperbolic codebook attention mechanism and quantized contrastive learning on the hyperbolic product manifold are introduced to expedite quantization. Furthermore, we propose a hierarchical semantics learning module, designed to enhance the distinction between similar and non-matching images for a query by using the extracted hierarchical semantics as additional training supervision. Experiments on benchmarks show that our proposed method outperforms state-of-the-art baselines.

Paper Structure

This paper contains 33 sections, 23 equations, 4 figures, and 3 tables.

Figures (4)

  • Figure 1: Left: Illustration of hierarchical semantics preserved in images. Right: Distance comparison on Euclidean space (top) and hyperbolic space (bottom). The $2$-dimensional hyperbolic space example is depicted using the Lorentz model $\mathcal{L}^2$ in $\mathbb{R}^3$.
  • Figure 2: (a) The architecture of HiHPQ. "ExpMap" is short for the exponential map. (b) An example of our hyperbolic product quantizer. In this example, there are two 2-dimensional hyperbolic spaces $\mathbb{H}^2_{\theta_1}$ and $\mathbb{H}^2_{\theta_2}$, depicted by the Lorentz model in $\mathbb{R}^3$, where $\theta_1$ and $\theta_2$ denote the curvature parameters. On the right side, two codebooks $C^1$ and $C^2$ are displayed using 2D Voronoi diagrams. Subvectors are quantized by codewords in the corresponding codebooks via the hyperbolic distance metric.
  • Figure 3: Illustration of both instance-wise and prototype-wise contrastive learning based on our extracted hierarchy.
  • Figure 4: (a) MAP@1000 with varying numbers of codebooks on CIFAR-10 (I); "A $\rightarrow$ B" denotes the performance gain achieved by Model A compared to Model B. (b) 32-bit quantization errors of HiHPQ and its variant on CIFAR-10 (II). (c) Effect of the temperature $\tau_{qc}$ on CIFAR-10 (II). (d) Effect of the initial curvature parameters on CIFAR-10 (II).
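
The quantization step described in the Figure 2 caption (map each subvector onto its own hyperbolic space with the exponential map, then pick the nearest codeword under the hyperbolic distance) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the Lorentz model is parameterized so that points satisfy $\langle x, x \rangle_{\mathcal{L}} = -\theta$ with $\theta > 0$, which may differ from the paper's curvature convention, and it uses hard nearest-codeword assignment rather than the differentiable soft codebook. The function names (`lorentz_inner`, `exp_map_origin`, `lorentz_distance`, `quantize`) and all numeric values are hypothetical and chosen only for illustration.

```python
import math

def lorentz_inner(x, y):
    # Lorentzian inner product: -x0*y0 + sum_i xi*yi
    return -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))

def exp_map_origin(u, theta=1.0):
    # Exponential map at the origin (sqrt(theta), 0, ..., 0).
    # u is the spatial part of a tangent vector at the origin.
    # Assumed convention: points satisfy <x, x>_L = -theta.
    norm = math.sqrt(sum(a * a for a in u))
    root = math.sqrt(theta)
    if norm < 1e-12:
        return [root] + [0.0] * len(u)
    r = norm / root
    scale = root * math.sinh(r) / norm
    return [root * math.cosh(r)] + [scale * a for a in u]

def lorentz_distance(x, y, theta=1.0):
    # Geodesic distance; the argument is clamped to 1 to guard against rounding.
    arg = max(-lorentz_inner(x, y) / theta, 1.0)
    return math.sqrt(theta) * math.acosh(arg)

def quantize(embedding, codebooks, thetas):
    # Split a flat Euclidean embedding into len(codebooks) equal subvectors,
    # map each onto its own hyperbolic space, and return the index of the
    # nearest codeword per subspace (hard assignment).
    m = len(codebooks)
    d = len(embedding) // m
    codes = []
    for j in range(m):
        sub = exp_map_origin(embedding[j * d:(j + 1) * d], thetas[j])
        dists = [lorentz_distance(sub, c, thetas[j]) for c in codebooks[j]]
        codes.append(min(range(len(dists)), key=dists.__getitem__))
    return codes

# Toy setup mirroring Figure 2(b): two 2-dimensional hyperbolic spaces,
# each with its own curvature parameter and codebook (values are made up).
thetas = [1.0, 2.0]
codebooks = [
    [exp_map_origin(u, thetas[0]) for u in ([1.0, 0.0], [-1.0, 0.0], [0.0, 1.0])],
    [exp_map_origin(u, thetas[1]) for u in ([0.5, 0.5], [-0.5, -0.5])],
]
embedding = [0.9, 0.1, -0.4, -0.6]
codes = quantize(embedding, codebooks, thetas)
print(codes)  # [0, 1]
```

Each image is then represented by one codeword index per subspace, which is what makes the representation compact. In training, a hard `min` like this is not differentiable, which is presumably why the method uses a soft codebook instead.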