Sign-Guided Bipartite Graph Hashing for Hamming Space Search

Xueyi Wu

TL;DR

The paper tackles Top-K retrieval in Hamming space by analyzing and improving bipartite graph hashing. It introduces LightGCH as a lightweight baseline and then proposes SGBGH, a sign-guided framework that combines sign-guided negative sampling with sign-aware contrastive learning. These two components target the problems observed in LightGCH: true neighbors have low Hamming similarity at the shallow layer, and all nodes have high Hamming similarity at the deep layers, so embeddings are not uniform enough. The approach achieves substantial gains over state-of-the-art end-to-end hashing and quantization-based methods across multiple real-world datasets, while also reducing training time and keeping inference efficient with mixed-precision embeddings. Overall, the work highlights the importance of sign properties and layer-wise propagation in learning effective hash embeddings for scalable Hamming-space search.

Abstract

Bipartite graph hashing (BGH) is widely used for Top-K search in Hamming space because of its low storage and inference costs. Recent work adopts graph convolutional hashing for BGH and achieves state-of-the-art performance. However, the contributions of several influencing factors to hashing performance have not been explored in depth: the number of same-sign and different-sign positions between two binary embeddings during Hamming space search (sign property), the contribution of the sub-embeddings at each layer (model property), the contribution of the different node types in the bipartite graph (node property), and the combination of augmentation methods. In this work, we build a lightweight graph convolutional hashing model, LightGCH, mainly by removing the augmentation methods of the state-of-the-art model BGCH. By analyzing the contribution of each layer and node type to performance, together with the Hamming similarity statistics at each layer, we find that in LightGCH the actual neighbors in the bipartite graph tend to have low Hamming similarity at the shallow layer, while all nodes tend to have high Hamming similarity at the deep layers. To tackle these problems, we propose SGBGH, a novel sign-guided framework that uses sign-guided negative sampling to raise the Hamming similarity of neighbors and sign-aware contrastive learning to help nodes learn more uniform representations. Experimental results show that SGBGH significantly outperforms BGCH and LightGCH in embedding quality.

Paper Structure (34 sections, 14 equations, 9 figures, 2 tables)

Figures (9)

  • Figure 1: Performance comparison of mixed-precision embedding and binary embedding of different hashing methods. SGBGH and LightGCH are both our proposed methods.
  • Figure 2: Performance contribution of each node type at each layer in LightGCH on Gowalla dataset. $\mathcal{B}_U$, $\mathcal{B}_V$, $\mathcal{B}_{U,V}$ denote binarizing embeddings of nodes in $U$, $V$, $U \cup V$ per layer.
  • Figure 3: Floating-point embedding, binary embedding and mixed-precision embedding.
  • Figure 4: Layer-wise Hamming similarity statistics of ground-truth neighbors in LightGCH.
  • Figure 5: Layer-wise Hamming similarity statistics of non-neighbors in LightGCH.
  • ...and 4 more figures

Theorems & Definitions (1)

  • Definition 1: Hamming Similarity
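
The text of Definition 1 is not reproduced on this page, so the following is only a minimal sketch of one common formulation, not the paper's exact definition. It assumes binary embeddings take values in {-1, +1} and treats Hamming similarity as the number of positions where two codes share the same sign (the "same sign count" mentioned in the abstract). The function names `binarize`, `hamming_similarity`, and `top_k` are hypothetical helpers introduced for illustration.

```python
import numpy as np

def binarize(x):
    """Map real-valued embeddings to {-1, +1} by sign (assumption: zeros map to +1)."""
    x = np.asarray(x, dtype=np.float64)
    return np.where(x >= 0, 1, -1).astype(np.int8)

def hamming_similarity(a, b):
    """Number of positions where two {-1, +1} codes have the same sign."""
    return int(np.sum(np.asarray(a) == np.asarray(b)))

def top_k(query, items, k):
    """Return indices and similarities of the k items closest to `query`.

    For {-1, +1} codes of length d, matches = (d + <q, v>) / 2,
    so one integer matrix-vector product scores every item at once.
    """
    d = query.shape[0]
    dots = items.astype(np.int32) @ query.astype(np.int32)
    sims = (d + dots) // 2
    order = np.argsort(-sims, kind="stable")[:k]
    return order, sims[order]

# Usage: one query code against three candidate codes of length 4.
q = binarize([0.3, -1.2, 0.0, 2.0])
candidates = binarize([[0.1, -0.5, 0.2, 1.0],
                       [-1.0, 1.0, -1.0, -1.0],
                       [1.0, 1.0, 1.0, 1.0]])
idx, sims = top_k(q, candidates, k=2)
```

Under this assumption, Hamming similarity and Hamming distance sum to the code length, so ranking by highest similarity is the same as ranking by lowest distance.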