KALAHash: Knowledge-Anchored Low-Resource Adaptation for Deep Hashing

Shu Zhao, Tan Yu, Xiaoshuai Hao, Wenchao Ma, Vijaykrishnan Narayanan

TL;DR

This work addresses the challenge of adapting deep hashing models to downstream tasks with very limited data by introducing KALAHash, a knowledge-anchored, parameter-efficient framework. It combines Class-Calibration LoRA (CLoRA), which injects class-level textual knowledge as anchors into low-rank updates, with Knowledge-Guided Discrete Optimization (KIDDO) to align hash codes with textual knowledge via discrete optimization. The approach leverages Vision-Language Models (e.g., CLIP) to extract class semantics and uses a Top_r knowledge-selection mechanism to adapt updates dynamically, achieving substantial improvements in low-resource, multi-dataset hashing tasks and showing strong plug-and-play compatibility with existing methods. Empirically, KALAHash yields significant gains across NUS-WIDE, MS-COCO, and CIFAR-10, scales with backbone size, and maintains low inference overhead, highlighting its practical potential for efficient retrieval in data-scarce domains.
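
To make the idea of anchoring a low-rank update on class-level text embeddings more concrete, here is a minimal sketch of one plausible reading of the Top_r selection step. It is not the paper's exact formulation: the function names (`top_r_anchors`, `clora_forward`), the cosine-similarity selection rule, and the choice to use the selected embeddings directly as the down-projection are all assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def matvec(m, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in m]

def top_r_anchors(x, class_embeddings, r):
    """Return the r class embeddings most similar to x (hypothetical selection rule)."""
    ranked = sorted(class_embeddings, key=lambda k: cosine(x, k), reverse=True)
    return ranked[:r]

def clora_forward(x, w_frozen, class_embeddings, b_up, r):
    """Frozen projection plus a low-rank update anchored on selected class embeddings.

    ASSUMPTION: the selected text embeddings act as the down-projection (r x d_in),
    and b_up (d_out x r) is the only trainable matrix.
    """
    a_down = top_r_anchors(x, class_embeddings, r)
    base = matvec(w_frozen, x)
    delta = matvec(b_up, matvec(a_down, x))
    return [b + d for b, d in zip(base, delta)]

# Toy example: 3-dimensional features, 3 classes, rank r = 2.
x = [1.0, 0.0, 0.0]
class_embeddings = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.9, 0.1, 0.0]]
w_frozen = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
b_zero = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
y = clora_forward(x, w_frozen, class_embeddings, b_zero, r=2)
```

With the up-projection initialized to zero, as in standard LoRA, the output equals that of the frozen layer, so adaptation starts from the pre-trained behavior.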

Abstract

Deep hashing has been widely used for large-scale approximate nearest neighbor search due to its storage and search efficiency. However, existing deep hashing methods predominantly rely on abundant training data, leaving the more challenging scenario of low-resource adaptation for deep hashing relatively underexplored. This setting involves adapting pre-trained models to downstream tasks with only an extremely small number of training samples available. Our preliminary benchmarks reveal that current methods suffer significant performance degradation due to the distribution shift caused by limited training samples. To address these challenges, we introduce Class-Calibration LoRA (CLoRA), a novel plug-and-play approach that dynamically constructs low-rank adaptation matrices by leveraging class-level textual knowledge embeddings. CLoRA effectively incorporates prior class knowledge as anchors, enabling parameter-efficient fine-tuning while maintaining the original data distribution. Furthermore, we propose Knowledge-Guided Discrete Optimization (KIDDO), a framework that utilizes class knowledge to compensate for the scarcity of visual information and enhance the discriminability of hash codes. Extensive experiments demonstrate that our proposed method, Knowledge-Anchored Low-Resource Adaptation Hashing (KALAHash), significantly boosts retrieval performance and achieves 4x data efficiency in low-resource scenarios.
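
The storage and search efficiency mentioned above comes from representing each item as a short binary code and comparing codes by Hamming distance. The sketch below illustrates that generic retrieval step only; it is not KALAHash itself. The helper names (`to_code`, `hamming`, `retrieve`) are hypothetical, and it uses an exhaustive scan rather than an approximate index.

```python
def to_code(bits):
    """Pack a sequence of 0/1 bits into a single integer hash code."""
    code = 0
    for bit in bits:
        code = (code << 1) | (1 if bit else 0)
    return code

def hamming(a, b):
    """Number of differing bits between two integer hash codes."""
    return bin(a ^ b).count("1")

def retrieve(query, database, k):
    """Return indices of the k database codes closest to the query in Hamming distance."""
    order = sorted(range(len(database)), key=lambda i: hamming(query, database[i]))
    return order[:k]

# Toy database of 4-bit codes.
database = [to_code([1, 0, 1, 1]), to_code([0, 0, 0, 0]), to_code([1, 0, 1, 0])]
query = to_code([1, 0, 1, 1])
nearest = retrieve(query, database, k=2)
```

Each code fits in a machine word, and the distance is an XOR followed by a bit count, which is why hashing-based retrieval is cheap compared with comparing floating-point embeddings.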

Paper Structure

This paper contains 28 sections, 12 equations, 10 figures, 9 tables.

Figures (10)

  • Figure 1: Performance comparison in low-resource settings (1-shot on the CIFAR-10 dataset), including mean Average Precision scores (left) and Silhouette Scores (right). FFT and LB denote Full Fine-Tuning and Lock Backbone, respectively. The increasing mAP and Silhouette Score indicate improved cluster separation and cohesion in the embedding space, demonstrating the effectiveness of our approach in addressing the distribution shift challenge. For the Silhouette Score, we normalize its range from $[-1, +1]$ to $[0, 100]$.
  • Figure 2: Architecture overview of the proposed KALAHash method, illustrating the integration of Class-Calibration LoRA (CLoRA) and Knowledge-Guided Discrete Optimization (KIDDO).
  • Figure 3: Architecture of the proposed CLoRA module.
  • Figure 4: Performance comparison of KALAHash and baseline methods as the number of shots increases from $1$ to $500$ on the CIFAR-10 dataset. MDSH-FFT denotes the MDSH baseline with all parameters fine-tuned.
  • Figure 5: Performance scaling of KALAHash with different backbone models (SLIP ViT-S, CLIP ViT-B, CLIP ViT-L) in relation to the number of model parameters.
  • ...and 5 more figures
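
The Figure 1 caption states that the Silhouette Score is normalized from $[-1, +1]$ to $[0, 100]$ but does not give the mapping. A minimal sketch, assuming a plain linear rescale (the function name `normalize_silhouette` is hypothetical):

```python
def normalize_silhouette(score):
    """Map a Silhouette Score from [-1, 1] to [0, 100] with a linear rescale.

    ASSUMPTION: the caption only gives the two ranges; a linear map is the
    simplest transformation consistent with them.
    """
    if not -1.0 <= score <= 1.0:
        raise ValueError("Silhouette Score must lie in [-1, 1]")
    return (score + 1.0) * 50.0

worst = normalize_silhouette(-1.0)    # fully misassigned clusters
neutral = normalize_silhouette(0.0)   # overlapping clusters
best = normalize_silhouette(1.0)      # perfectly separated clusters
```

Under this mapping a raw score of 0, which indicates overlapping clusters, lands at 50, so plotted values above 50 correspond to positive raw Silhouette Scores.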