Optimizing Rank-based Metrics with Blackbox Differentiation

Michal Rolínek, Vít Musil, Anselm Paulus, Marin Vlastelica, Claudio Michaelis, Georg Martius

TL;DR

This paper tackles the challenge of directly optimizing non-differentiable rank-based metrics like AP and Recall in vision tasks. It introduces RaMBO, a blackbox differentiation framework that treats ranking as a combinatorial solver and backpropagates through a differentiable interpolation with controlled fidelity via a parameter $\lambda$. The authors design sound loss functions (recall-based and AP-based) and address practical issues such as mini-batch bias, ties, and sparse supervision through margin shifts, score memory, and refined recall signals. Empirical results on image retrieval and object detection show competitive retrieval performance and consistent improvements on near-state-of-the-art detectors, while maintaining computational efficiency suitable for long sequences. The work provides a principled, implementable path for integrating rank-based optimization into standard CV pipelines, with public code to foster adoption and extension.

Abstract

Rank-based metrics are some of the most widely used criteria for performance evaluation of computer vision models. Despite years of effort, direct optimization for these metrics remains a challenge due to their non-differentiable and non-decomposable nature. We present an efficient, theoretically sound, and general method for differentiating rank-based metrics with mini-batch gradient descent. In addition, we address optimization instability and sparsity of the supervision signal that both arise from using rank-based metrics as optimization targets. Resulting losses based on recall and Average Precision are applied to image retrieval and object detection tasks. We obtain performance that is competitive with state-of-the-art on standard image retrieval datasets and consistently improve performance of near state-of-the-art object detectors. The code is available at https://github.com/martius-lab/blackbox-backprop
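The idea of differentiating through ranking as a blackbox can be sketched in a few lines. The snippet below is a minimal illustration, not the authors' released implementation: the function names `rank` and `blackbox_rank_grad`, the tie-breaking rule, and the toy loss are assumptions made for this example. It assumes the general blackbox-differentiation recipe: perturb the scores by $\lambda$ times the incoming gradient, re-run the ranking, and return the scaled difference of the two rankings.

```python
def rank(scores):
    """Return 1-based ranks; the highest score gets rank 1.

    Ties are broken by index order (an assumption of this sketch).
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0] * len(scores)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def blackbox_rank_grad(scores, grad_wrt_ranks, lam):
    """Gradient of the loss w.r.t. the scores through the ranking operation.

    scores:         list of floats fed to the ranking
    grad_wrt_ranks: dL/d(rank), one entry per score
    lam:            interpolation strength (lambda), must be > 0
    """
    ranks = rank(scores)
    # Shift the scores in the direction of the incoming gradient.
    perturbed = [s + lam * g for s, g in zip(scores, grad_wrt_ranks)]
    ranks_lam = rank(perturbed)
    # Scaled difference between the original and the perturbed ranking.
    return [-(r - r_lam) / lam for r, r_lam in zip(ranks, ranks_lam)]


# Toy loss: the rank of item 2, which we would like to be smaller.
scores = [0.9, 0.2, 0.5]
grad_wrt_ranks = [0.0, 0.0, 1.0]

grad_strong = blackbox_rank_grad(scores, grad_wrt_ranks, lam=0.5)
grad_weak = blackbox_rank_grad(scores, grad_wrt_ranks, lam=0.1)
print(grad_strong)  # [2.0, 0.0, -2.0]
print(grad_weak)    # [0.0, 0.0, 0.0]
```

With `lam=0.5` the perturbation is large enough to swap items 0 and 2, so a gradient-descent step lowers the score of item 0 and raises that of item 2. With `lam=0.1` the ranking is unchanged and the gradient vanishes, which illustrates how $\lambda$ trades faithfulness to the piecewise constant loss for informative gradients.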

Paper Structure

This paper contains 36 sections, 4 theorems, 35 equations, 6 figures, 9 tables, 1 algorithm.

Key Result

Proposition 1

In the notation set by Eqs. eq:rank-def and eq:rank-def2, we have

Figures (6)

  • Figure 1: Differentiation of a piecewise constant rank-based loss. A two-dimensional section of the loss landscape is shown (left) along with two efficiently differentiable interpolations of increasing strengths (middle and right).
  • Figure 2: Mini-batch estimation of mean Average Precision. The expected $\mathit{mAP}$ over mini-batches (i.e. the optimized loss) is an overly optimistic estimator of the true $\mathit{mAP}$ over the dataset, particularly for small batch sizes. The mean and standard deviation over sampled mini-batch estimates are displayed.
  • Figure 3: Naive rank-based losses can collapse during optimization. Shifting the scores during training induces a margin and a suitable scale for the scores. Red lines indicate negative scores and green positive scores.
  • Figure 4: Stanford Online Products image retrieval examples.
  • Figure 5: Evolution of the ranking-surrogate landscapes with respect to their parameters.
  • ...and 1 more figure
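
The mini-batch bias described in the caption of Figure 2 can be reproduced on a toy example. The sketch below is an illustration under stated assumptions, not the paper's experiment: it uses a single query with four hand-picked items, enumerates every mini-batch of size 2 instead of sampling, and skips batches without a positive item (where Average Precision is undefined). The helper names `average_precision` and `mean_batch_ap` are hypothetical.

```python
from itertools import combinations


def average_precision(scores, labels):
    """Average Precision of a ranked list; labels are 1 (positive) or 0.

    Assumes at least one positive label is present.
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits = 0
    precisions = []
    for position, index in enumerate(order, start=1):
        if labels[index] == 1:
            hits += 1
            precisions.append(hits / position)
    return sum(precisions) / len(precisions)


def mean_batch_ap(scores, labels, batch_size):
    """Mean AP over all mini-batches of the given size.

    Batches without any positive item are skipped (an assumption of
    this sketch, since AP is undefined for them).
    """
    values = []
    for batch in combinations(range(len(scores)), batch_size):
        batch_labels = [labels[i] for i in batch]
        if sum(batch_labels) == 0:
            continue
        batch_scores = [scores[i] for i in batch]
        values.append(average_precision(batch_scores, batch_labels))
    return sum(values) / len(values)


# Toy data: positives sit at ranks 2 and 4 of the full ranking.
scores = [0.9, 0.7, 0.5, 0.3]
labels = [0, 1, 0, 1]

full_ap = average_precision(scores, labels)
batch_ap = mean_batch_ap(scores, labels, batch_size=2)
print(full_ap, batch_ap)
```

On this data the Average Precision over the full list is 0.5, while the mean over the five valid mini-batches of size 2 is 0.7, so the mini-batch estimate is the more optimistic of the two.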

Theorems & Definitions (8)

  • Proposition 1
  • Theorem 1: Rearrangement inequality
  • Proof of Proposition \ref{prop:rank}
  • Lemma 1
  • Proposition 2
  • Proof
  • Proof of Lemma \ref{lem:coarea}
  • Proof of \ref{eq:lrec-log}