AdapNet: Adaptive Noise-Based Network for Low-Quality Image Retrieval

Sihe Zhang, Qingdong He, Jinlong Peng, Yuxi Li, Zhengkai Jiang, Jiafu Wu, Mingmin Chi, Yabiao Wang, Chengjie Wang

TL;DR

AdapNet tackles the challenge of retrieving visually similar images when query images are degraded by noise. It introduces a quality compensation block to learn known noise characteristics and an adaptive NoiRetrieval Loss that reweights gradients to emphasize learning from unknown noise, improving robustness without sacrificing performance on high-quality data. The approach is validated on newly constructed noisy benchmarks (Noise ROxf and Noise RPar) and outperforms state-of-the-art methods in noisy settings while remaining competitive on clean data. The work provides a practical, scalable solution for robust image retrieval in real-world scenarios with imperfect queries and offers code and datasets for future research.

Abstract

Image retrieval aims to identify visually similar images within a database using a given query image. Traditional methods typically employ both global and local features extracted from images for matching, and may also apply re-ranking techniques to enhance accuracy. However, these methods often fail to account for the noise present in query images, which can stem from natural or human-induced factors and degrades retrieval performance. To mitigate this issue, we introduce a novel setting for low-quality image retrieval, and propose an Adaptive Noise-Based Network (AdapNet) to learn robust abstract representations. Specifically, we devise a quality compensation block trained to compensate for various low-quality factors in input images. In addition, we introduce an adaptive noise-based loss function, which dynamically adjusts its focus on the gradient in accordance with image quality, thereby augmenting the learning of unknown noisy samples during training and enhancing intra-class compactness. To assess performance, we construct two datasets with low-quality queries, which are built by applying various types of noise to the clean query images of the standard Revisited Oxford and Revisited Paris datasets. Comprehensive experimental results show that AdapNet surpasses state-of-the-art methods on the Noise Revisited Oxford and Noise Revisited Paris benchmarks, while maintaining competitive performance on high-quality datasets. The code and constructed datasets will be made available.
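
The construction of low-quality queries described above can be illustrated with a minimal sketch. The abstract does not list the noise types used, so the two corruptions below (additive Gaussian noise and salt-and-pepper noise), the function names, and the toy 8x8 grayscale image are all assumptions chosen for illustration, not the benchmark's actual recipe.

```python
import random

def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise to a grayscale image (rows of 0-255 ints).

    Assumed corruption type; values are rounded and clamped to [0, 255].
    """
    noisy = []
    for row in image:
        noisy.append(
            [min(255, max(0, round(p + rng.gauss(0.0, sigma)))) for p in row]
        )
    return noisy

def add_salt_and_pepper(image, prob, rng):
    """Replace each pixel by 0 or 255 with total probability `prob`.

    Assumed corruption type; half of the corrupted pixels become 0 (pepper)
    and half become 255 (salt), in expectation.
    """
    noisy = []
    for row in image:
        new_row = []
        for p in row:
            u = rng.random()
            if u < prob / 2:
                new_row.append(0)
            elif u < prob:
                new_row.append(255)
            else:
                new_row.append(p)
        noisy.append(new_row)
    return noisy

# Toy "clean query": a flat gray 8x8 image (hypothetical stand-in for a real query).
rng = random.Random(0)
clean_query = [[128] * 8 for _ in range(8)]
noisy_gauss = add_gaussian_noise(clean_query, sigma=20.0, rng=rng)
noisy_sp = add_salt_and_pepper(clean_query, prob=0.1, rng=rng)
```

Only the query is corrupted in this sketch; the database images stay clean, which mirrors the setting of noisy queries against a standard retrieval database.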

Paper Structure

This paper contains 22 sections, 11 equations, 4 figures, and 4 tables.

Figures (4)

  • Figure 1: Overview of our proposed method. The current high-quality image input $x_{h}^{i}$ and low-quality image input $x_{l}^{i}$ are fed into the backbone to generate corresponding embeddings. Quality compensation features in multiple colors are employed to learn known noise, and NoiRetrieval Loss is utilized to dynamically adjust the gradient of unknown noise according to image quality.
  • Figure 2: The detailed structure of our proposed Adaptive Noise-Based Network (AdapNet). Our network is organized into three components: Backbone, Quality Compensation Block (QCB), and Noise Gradient Bias (NGB). The Backbone extracts the local embedding ${f}_{local}$ and global embedding ${f}_{global}$ of input images. The QCB learns the compensation features from the extracted global features of the input image pairs and integrates them with the low-quality features to form ${f}_{new}$. The NGB adjusts the allocation of gradients to the final features ${f}_{all}$ based on image quality $||z_{i}||$, prioritizing the learning of low-quality images.
  • Figure 3: Here we illustrate NoiRetrieval Loss, AdaFace Loss, and their corresponding gradient scaling terms within the feature space. The arc in the feature space represents the angular relationship between a sample and the ground truth class weight vector, $W_{{y}_i}$, as well as the negative class weight vector $W_{j}$. A well-classified sample will be proximate, in terms of angle, to the ground truth class weight vector, $W_{{y}_i}$. Conversely, a misclassified sample will be closer to $W_{j}$. The color within the arc represents the magnitude of the gradient scaling term g. Samples located in the dark red region will contribute more significantly to the learning process.
  • Figure 4: Top retrieved results (ranks 6-10). The query image on the left is generated by cropping only the region bounded by the white box. On the right, we present the results of CFCD [zhu2023coarse], SENet [lee2023revisiting], AdaFace [kim2022adaface], and our method, displayed from top to bottom. Images enclosed in green and red boxes denote positive and negative images, respectively.