Metric Compatible Training for Online Backfilling in Large-Scale Retrieval

Seonguk Seo, Mustafa Gokhan Uzunbas, Bohyung Han, Sara Cao, Ser-Nam Lim

TL;DR

The paper addresses the high cost and downtime of re-indexing gallery embeddings after large-scale image retrieval model upgrades. It introduces online backfilling via distance rank merge, a reverse query transform for single-pass inference, and metric-compatible contrastive learning to calibrate distances across old and new embedding spaces, with an optional learnable new embedding. The approach achieves monotonic performance gains during backfilling and preserves or surpasses final offline performance across four benchmarks, outperforming prior backward-compatible and online backfilling methods. The work offers a practical, low-overhead solution for rapid, high-quality model upgrades in real-world retrieval systems, with strong robustness to homogeneous, heterogeneous, and open-class settings.

Abstract

Backfilling is the process of re-extracting all gallery embeddings from upgraded models in image retrieval systems. It inevitably incurs a prohibitively large computational cost and can even entail service downtime. Although backward-compatible learning sidesteps this challenge by tackling query-side representations, this leads to suboptimal solutions in principle because gallery embeddings cannot benefit from model upgrades. We address this dilemma by introducing an online backfilling algorithm, which enables us to achieve a progressive performance improvement during the backfilling process without sacrificing the final performance of the new model after backfilling completes. To this end, we first propose a simple distance rank merge technique for online backfilling. Then, we incorporate a reverse transformation module for more effective and efficient merging, which is further enhanced by adopting a metric-compatible contrastive learning approach. These two components help make the distances of the old and new models compatible, resulting in desirable merge results during backfilling with no extra computational overhead. Extensive experiments show the effectiveness of our framework on four standard benchmarks in various settings.
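The distance rank merge step mentioned in the abstract can be illustrated with a minimal sketch. It assumes cosine distance, NumPy arrays for embeddings, and a gallery that is split into a not-yet-backfilled part (old embeddings) and an already-backfilled part (new embeddings). The function names `cosine_distances` and `rank_merge`, the `top_k` parameter, and the toy data are illustrative choices, not taken from the paper.

```python
import numpy as np


def cosine_distances(query, gallery):
    """Cosine distance from one query vector to every row of a gallery."""
    q = query / np.linalg.norm(query)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    return 1.0 - g @ q


def rank_merge(q_old, q_new, gallery_old, ids_old, gallery_new, ids_new, top_k=5):
    """Search the old and new galleries separately, then merge by distance.

    q_old / q_new are the query embedded by the old / new model (assumption:
    both are available at query time). The two candidate lists are pooled and
    sorted by ascending distance, which treats the two distance scales as
    directly comparable.
    """
    d_old = cosine_distances(q_old, gallery_old)
    d_new = cosine_distances(q_new, gallery_new)
    dists = np.concatenate([d_old, d_new])
    ids = np.concatenate([ids_old, ids_new])
    order = np.argsort(dists, kind="stable")[:top_k]
    return ids[order], dists[order]


# Toy state midway through backfilling: items 0-1 still carry old embeddings,
# items 2-3 have already been re-extracted with the new model.
q_old = np.array([1.0, 0.0])
q_new = np.array([1.0, 0.0])
gallery_old = np.array([[1.0, 0.0], [0.0, 1.0]])
ids_old = np.array([0, 1])
gallery_new = np.array([[1.0, 1.0], [-1.0, 0.0]])
ids_new = np.array([2, 3])

merged_ids, merged_dists = rank_merge(
    q_old, q_new, gallery_old, ids_old, gallery_new, ids_new, top_k=3
)
print(merged_ids, merged_dists)
```

Pooling raw distances like this only ranks well when the old and new models produce distances on comparable scales, which is the gap the metric-compatible training in the abstract is meant to close.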

Paper Structure (38 sections, 11 equations, 11 figures, 2 tables)

Figures (11)

  • Figure 1: Image retrieval with the proposed distance rank merge technique. In the middle of backfilling, we retrieve images independently using two separate models and their galleries, and merge the retrieval results based on their distances. Note that the total number of gallery embeddings is fixed throughout the backfilling process, i.e., $|\mathbf{G}| = |\mathbf{G}^\text{new}|+|\mathbf{G}^\text{old}|$.
  • Figure 2: mAP and CMC results on the standard benchmarks using ResNet-18. Old and New denote the performance without backfilling and with offline backfilling, respectively. The distance rank merge of the old and new models, denoted by Merge, exhibits desirable results: the accuracy increases monotonically as backfilling progresses, without negative flips, on all datasets, and the rank merge with online backfilling achieves final performance competitive with offline backfilling. The numbers in the legend indicate either AUC$_\text{mAP}$ or AUC$_\text{CMC}$ scores.
  • Figure 3: Image retrieval with our final rank merge framework, including the components of Sections \ref{sub:formulation}-\ref{sub:learnnew}. The backward retrieval system consists of the reversely transformed new query and the old gallery, $\{ \rho^\text{rev}, \phi^\text{old} \}$. The final image retrieval results are given by merging the outputs from $\{ \rho^\text{rev}, \phi^\text{old} \}$ and $\{ \rho^\text{new}, \rho^\text{new} \}$.
  • Figure 4: Illustration of the metric-compatible contrastive learning loss with the backward retrieval system $\{ \phi^\text{old}, \phi^\text{rev} \}$ and the new retrieval system $\{ \phi^\text{new}, \phi^\text{new} \}$. The two boxes with dotted lines correspond to the two terms in Eq. \ref{eq:cmcl}. For each retrieval system, the distances between positive pairs are trained to be smaller than the distances between negative pairs in both systems.
  • Figure 5: Compatible training with a learnable new embedding, where another transformation module $\rho(\cdot)$ is incorporated on top of the new model to learn a new embedding favorable to our rank merging. The retrieval results are now merged from $\{ \rho^\text{rev}, \phi^\text{old} \}$ and $\{ \rho^\text{new}, \rho^\text{new} \}$.
  • ...and 6 more figures
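
The Figure 4 caption states that, in each retrieval system, positive pairs should end up closer than negative pairs in both systems. One possible reading of that constraint is sketched below as an InfoNCE-style loss whose denominator pools negatives from the backward and the new system. This is an illustrative assumption, not the paper's equation: the function name `metric_compatible_loss`, the cosine similarity, the `temperature` value, and the use of class labels to define positives are all hypothetical choices.

```python
import numpy as np


def _normalize(x):
    """L2-normalize each row."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def metric_compatible_loss(q_rev, g_old, q_new, g_new, labels, temperature=0.1):
    """Contrastive loss with negatives pooled across two retrieval systems.

    q_rev, g_old: (N, d_old) query/gallery embeddings of the backward system.
    q_new, g_new: (N, d_new) query/gallery embeddings of the new system.
    labels:       (N,) class labels; same label (different index) = positive.
    """
    s_back = _normalize(q_rev) @ _normalize(g_old).T / temperature
    s_new = _normalize(q_new) @ _normalize(g_new).T / temperature

    same = labels[:, None] == labels[None, :]
    positive = same & ~np.eye(len(labels), dtype=bool)
    negative = ~same

    # Negatives from BOTH systems share one denominator, so a positive pair
    # in either system must beat the negatives of both systems.
    pooled_neg = np.concatenate(
        [np.where(negative, s_back, -np.inf), np.where(negative, s_new, -np.inf)],
        axis=1,
    )

    losses = []
    for sims in (s_back, s_new):
        for i, p in zip(*np.nonzero(positive)):
            logits = np.concatenate([[sims[i, p]], pooled_neg[i]])
            m = logits.max()
            log_sum_exp = m + np.log(np.exp(logits - m).sum())
            losses.append(log_sum_exp - sims[i, p])
    return float(np.mean(losses)) if losses else 0.0


# Toy check: two classes, embeddings either aligned with or opposed to labels.
labels = np.array([0, 0, 1, 1])
aligned = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
swapped = aligned[::-1].copy()

loss_good = metric_compatible_loss(aligned, aligned, aligned, aligned, labels)
loss_bad = metric_compatible_loss(aligned, swapped, aligned, swapped, labels)
print(loss_good, loss_bad)
```

Sharing the negatives across systems is what ties the two distance scales together in this sketch: a loss computed per system in isolation would leave the scales free to drift apart, and a distance-based merge would then favor one gallery.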