Mutual Learning for Hashing: Unlocking Strong Hash Functions from Weak Supervision

Xiaoxu Ma, Runhao Li, Zhenyu Weng

TL;DR

MLH addresses the gap between global semantic modeling and local similarity preservation in deep hashing by coupling a strong center-based branch with a weaker pairwise-based branch through mutual learning. It introduces a hashing-focused Mixture-of-Hash-Experts module that enables cross-branch interaction on top of a shared backbone, and it optimizes three losses: a center-based loss $L_C$, a pairwise loss $L_P$, and a cosine-based mutual loss $L_M$. The approach yields consistent mAP gains on CIFAR-10, ImageNet, and MSCOCO, outperforming state-of-the-art methods by up to roughly 1–2 percentage points at 16-, 32-, and 64-bit code lengths. This work demonstrates the practical value of combining global and local supervision signals via mutual learning and expert sharing for scalable image retrieval tasks.

Abstract

Deep hashing has been widely adopted for large-scale image retrieval, with numerous strategies proposed to optimize hash function learning. Pairwise-based methods are effective in learning hash functions that preserve local similarity relationships, whereas center-based methods typically achieve superior performance by more effectively capturing global data distributions. However, the strength of center-based methods in modeling global structures often comes at the expense of underutilizing important local similarity information. To address this limitation, we propose Mutual Learning for Hashing (MLH), a novel weak-to-strong framework that enhances a center-based hashing branch by transferring knowledge from a weaker pairwise-based branch. MLH consists of two branches: a strong center-based branch and a weaker pairwise-based branch. Through an iterative mutual learning process, the center-based branch leverages local similarity cues learned by the pairwise-based branch. Furthermore, inspired by the mixture-of-experts paradigm, we introduce a novel mixture-of-hash-experts module that enables effective cross-branch interaction, further enhancing the performance of both branches. Extensive experiments demonstrate that MLH consistently outperforms state-of-the-art hashing methods across multiple benchmark datasets.
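The summary names a cosine-based mutual loss $L_M$ alongside $L_C$ and $L_P$ but does not give their exact forms here. As a minimal sketch, assuming $L_M$ is the mean of one minus the cosine similarity between the two branches' continuous codes and that the three terms are combined by a weighted sum, the idea might look like this. The function names and the weights `w_pair` and `w_mutual` are hypothetical, not taken from the paper.

```python
import numpy as np

def cosine_mutual_loss(codes_center, codes_pair, eps=1e-8):
    """Mean (1 - cosine similarity) between two (batch, bits) code arrays.

    Assumed form of the mutual loss; the paper's exact definition may differ.
    """
    dot = np.sum(codes_center * codes_pair, axis=1)
    norms = (np.linalg.norm(codes_center, axis=1)
             * np.linalg.norm(codes_pair, axis=1) + eps)
    return float(np.mean(1.0 - dot / norms))

def total_loss(loss_center, loss_pair, loss_mutual, w_pair=1.0, w_mutual=1.0):
    """Hypothetical weighted sum of the three loss terms."""
    return loss_center + w_pair * loss_pair + w_mutual * loss_mutual

# Toy continuous codes from the two branches (4 bits, batch of 2).
center_codes = np.array([[0.9, -0.8, 0.7, -0.9],
                         [-0.6, 0.5, 0.8, 0.9]])
pair_codes = np.array([[0.8, -0.7, 0.9, -0.6],
                       [-0.5, 0.6, 0.7, 0.8]])
mutual = cosine_mutual_loss(center_codes, pair_codes)
```

Under this assumed form, the loss approaches 0 when the two branches agree in direction and approaches 2 when their codes point in opposite directions, so minimizing it pulls the branches' codes toward each other.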

Paper Structure

This paper contains 25 sections, 12 equations, 7 figures, 5 tables, 1 algorithm.

Figures (7)

  • Figure 1: Comparison of hash code learning strategies. (a) Pairwise supervision captures local similarity but lacks global structure. (b) Center-based supervision emphasizes global semantics while ignoring local relations. (c) The proposed MLH integrates both via dual hash layers and deep mutual learning. Cyan, orange, and red arrows indicate center-based, mutual, and pairwise loss, respectively.
  • Figure 2: Overview of the proposed Mutual Learning for Hashing (MLH) framework. A deep neural network extracts image features, which are then passed through two parallel branches: a pairwise-supervised weak branch and a center-supervised strong branch. Each branch generates hash codes via its own hash layer, enabling mutual learning between local and global similarity structures.
  • Figure 3: Comparison of expert-based hashing architectures. (a) Traditional mixture-of-experts (MoE) used in dual-branch tasks with separate expert modules. (b) The proposed Mixture of Hashing Experts (MoH), featuring shared experts and independent gates, where each expert generates continuous hash codes. (c) A MoH variant without expert sharing, maintaining the expert-to-hash mapping.
  • Figure 4: Precision-recall curves on ImageNet across different bit configurations.
  • Figure 5: Impact of MLH on hash code distribution for ImageNet100 with 16-bit configurations. (a) Employs only center-based supervision, focusing on global data structure. (b) Utilizes solely pairwise-based supervision, emphasizing local similarity relationships. (c) Integrates mutual learning supervision, enabling the weak pairwise branch to fine-tune the strong center-based branch for improved hash code optimization.
  • ...and 2 more figures
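The Figure 3(b) caption describes shared experts with independent gates, where each expert emits continuous hash codes. A rough sketch of that layout follows, assuming linear experts with a `tanh` activation and softmax gates. The layer shapes, initialization, and the class name `MixtureOfHashExperts` are illustrative assumptions, not details from the paper.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

class MixtureOfHashExperts:
    """Shared experts, one gate per branch (hypothetical layout)."""

    def __init__(self, feat_dim, n_bits, n_experts, seed=0):
        rng = np.random.default_rng(seed)
        # One weight matrix per shared expert: features -> continuous code.
        self.expert_w = rng.normal(0.0, 0.1, (n_experts, feat_dim, n_bits))
        # Independent gates for the center-based and pairwise-based branches.
        self.gate_center = rng.normal(0.0, 0.1, (feat_dim, n_experts))
        self.gate_pair = rng.normal(0.0, 0.1, (feat_dim, n_experts))

    def forward(self, feats):
        # Each expert produces a continuous code in (-1, 1): (batch, experts, bits).
        expert_codes = np.tanh(np.einsum("bd,kdn->bkn", feats, self.expert_w))
        # Each branch mixes the same expert outputs with its own gate weights.
        w_center = softmax(feats @ self.gate_center)  # (batch, experts)
        w_pair = softmax(feats @ self.gate_pair)      # (batch, experts)
        code_center = np.einsum("bk,bkn->bn", w_center, expert_codes)
        code_pair = np.einsum("bk,bkn->bn", w_pair, expert_codes)
        return code_center, code_pair

moh = MixtureOfHashExperts(feat_dim=32, n_bits=16, n_experts=4)
feats = np.random.default_rng(1).normal(size=(8, 32))
code_center, code_pair = moh.forward(feats)
```

Because both gates mix the same expert outputs, an update driven by one branch's loss also changes the codes available to the other branch, which is one plausible reading of how expert sharing supports cross-branch interaction. At retrieval time, continuous codes like these are commonly binarized with a sign function.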