High-level Codes and Fine-grained Weights for Online Multi-modal Hashing Retrieval
Yu-Wei Zhan, Xiao-Ming Wu, Xin Luo, Yinwei Wei, Xin-Shun Xu
TL;DR
This work addresses online multi-modal hashing under streaming and category-incremental conditions with a method called High-level Codes, Fine-grained Weights (HCFW). The approach separates hash-code learning, which uses category-level high-level codes derived from semantic embeddings, from hash-function learning, which uses per-modality linear projections, and it fuses modalities with per-instance weights so that complementary modalities can be exploited. The key contributions are (i) a semantics-driven mechanism for generating high-level codes that keeps hash codes consistent across training rounds, and (ii) a fine-grained weighting scheme that performs modality fusion at the instance level and improves retrieval accuracy. On MIRFlickr and NUS-WIDE, HCFW achieves state-of-the-art mean average precision (MAP) in both the standard online and the category-incremental settings, with favorable convergence and runtime behavior. The authors also report ablations and analyses that validate each component, and state that code will be released.
Abstract
In the real world, multi-modal data often arrives in a streaming fashion, and there is a growing demand for similarity retrieval over such non-stationary data, especially at large scale. In response, online multi-modal hashing has gained significant attention. However, existing online multi-modal hashing methods suffer from two problems: hash codes become inconsistent during long-term learning, and the different modalities are fused inefficiently. In this paper, we present a novel supervised online multi-modal hashing approach called High-level Codes, Fine-grained Weights (HCFW). HCFW addresses these problems from two perspectives: 1) Online hashing. To keep hash codes consistent over long-term learning, especially in incremental learning scenarios, HCFW learns high-level codes derived from category-level semantics. These codes also handle the category-incremental setting, in which new categories appear over time. 2) Multi-modal hashing. HCFW introduces fine-grained weights, which are generated at the instance level to fuse complementary multi-modal data and thereby improve hashing performance. Experiments on two benchmark datasets demonstrate the effectiveness and efficiency of HCFW.
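The two ideas summarized above can be illustrated with a minimal sketch. It is not the authors' formulation: the random projection `P` used to turn category embeddings into codes, the softmax over projection magnitudes used as per-instance modality weights, and all function names are assumptions chosen only to make the ideas concrete. The sketch shows (a) category-level codes that stay unchanged when a new category arrives, and (b) a fusion of per-modality linear projections whose weights differ from instance to instance.

```python
import math
import random


def sign(x):
    # Binarize to {-1, +1}; zero maps to +1 by convention here.
    return 1 if x >= 0 else -1


def high_level_codes(category_embeddings, projection):
    """One r-bit code per category: sign(P e). Assumed stand-in for the
    paper's semantic-driven code generation."""
    return [[sign(sum(p * e for p, e in zip(row, emb))) for row in projection]
            for emb in category_embeddings]


def instance_code(labels, codes):
    """Code of a multi-label instance: sign of the sum of its category codes."""
    r = len(codes[0])
    return [sign(sum(codes[c][k] for c in labels)) for k in range(r)]


def project(W, x):
    # Per-modality linear projection W x.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def fine_grained_weights(projections, temperature=1.0):
    """Per-instance modality weights. Hypothetical proxy: softmax over the
    mean absolute projection value of each modality."""
    scores = [sum(abs(v) for v in p) / len(p) for p in projections]
    m = max(scores)
    exps = [math.exp((s - m) / temperature) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def fused_hash(features, Ws):
    """Weighted fusion of the modality projections, then binarization."""
    projections = [project(W, x) for W, x in zip(Ws, features)]
    weights = fine_grained_weights(projections)
    r = len(projections[0])
    fused = [sum(w * p[k] for w, p in zip(weights, projections))
             for k in range(r)]
    return [sign(v) for v in fused], weights


rng = random.Random(0)
r, d_sem = 8, 4  # code length and semantic-embedding size (illustrative)
P = [[rng.gauss(0, 1) for _ in range(d_sem)] for _ in range(r)]
embs = [[rng.gauss(0, 1) for _ in range(d_sem)] for _ in range(3)]

# Round 1 sees two categories; round 2 adds a third. Earlier codes are reused.
codes_round1 = high_level_codes(embs[:2], P)
codes_round2 = high_level_codes(embs, P)
target = instance_code([0, 1], codes_round2)

# Two modalities with different feature sizes (image: 5, text: 3).
W_img = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(r)]
W_txt = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(r)]
img = [rng.gauss(0, 1) for _ in range(5)]
txt = [rng.gauss(0, 1) for _ in range(3)]
code, weights = fused_hash([img, txt], [W_img, W_txt])
```

Because each category code depends only on that category's embedding and a fixed projection, adding a category leaves earlier codes untouched, which is the kind of long-term consistency the abstract describes. In the actual method the projections and weights would be learned from the data stream rather than sampled at random.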
