Learning Compatible Multi-Prize Subnetworks for Asymmetric Retrieval

Yushuai Sun, Zikun Zhou, Dongmei Jiang, Yaowei Wang, Jun Yu, Guangming Lu, Wenjie Pei

TL;DR

The paper tackles asymmetric retrieval by enabling a single dense network to produce multiple compatible subnetworks for devices with varying resources, without retraining. It introduces PrunNet, a learnable-score network that supports greedy post-training pruning to obtain multi-prize subnetworks of capacities $c_i$, and a conflict-aware gradient integration to harmonize diverse compatible losses. Empirical results on GLDv2, Paris/Roxford, In-shop, and VeRi-776 show state-of-the-art self- and cross-test performance across diverse backbones, and the approach generalizes to new capacities. The method reduces training overhead while expanding platform compatibility, with future work exploring structured pruning for hardware efficiency.

Abstract

Asymmetric retrieval is a typical scenario in real-world retrieval systems, where compatible models of varying capacities are deployed on platforms with different resource configurations. Existing methods generally train pre-defined networks or subnetworks with capacities specifically designed for pre-determined platforms, using compatible learning. Nevertheless, these methods suffer from limited flexibility for multi-platform deployment. For example, when introducing a new platform into the retrieval systems, developers have to train an additional model at an appropriate capacity that is compatible with existing models via backward-compatible learning. In this paper, we propose a Prunable Network with self-compatibility, which allows developers to generate compatible subnetworks at any desired capacity through post-training pruning. Thus it allows the creation of a sparse subnetwork matching the resources of the new platform without additional training. Specifically, we optimize both the architecture and weight of subnetworks at different capacities within a dense network in compatible learning. We also design a conflict-aware gradient integration scheme to handle the gradient conflicts between the dense network and subnetworks during compatible learning. Extensive experiments on diverse benchmarks and visual backbones demonstrate the effectiveness of our method. Our code and model are available at https://github.com/Bunny-Black/PrunNet.
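
The idea of generating a subnetwork at any desired capacity by post-training pruning can be illustrated with a minimal sketch. It assumes a per-layer top-k selection over learned connection scores (in the spirit of edge-popup); the function names `prune_mask` and `apply_mask`, the rounding rule, and the example scores and weights are hypothetical and not taken from the paper or its repository.

```python
def prune_mask(scores, capacity):
    """Return a 0/1 mask that keeps the top `capacity` fraction of connections by score."""
    if not 0.0 < capacity <= 1.0:
        raise ValueError("capacity must be in (0, 1]")
    # Assumption: keep at least one connection and round to the nearest count.
    k = max(1, round(capacity * len(scores)))
    # Sort indices by descending score; ties are broken by index for determinism.
    order = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    keep = set(order[:k])
    return [1 if i in keep else 0 for i in range(len(scores))]


def apply_mask(weights, mask):
    """Zero out the pruned connections; kept weights are left unchanged."""
    return [w * m for w, m in zip(weights, mask)]


# Hypothetical scores and weights for one layer with 8 connections.
scores = [0.9, 0.1, 0.5, 0.7, 0.3, 0.8, 0.2, 0.6]
weights = [0.4, -1.2, 0.3, 0.8, -0.5, 1.1, 0.05, -0.7]

mask_50 = prune_mask(scores, 0.5)    # keeps the 4 highest-scoring connections
mask_25 = prune_mask(scores, 0.25)   # keeps the 2 highest-scoring connections
sub_50 = apply_mask(weights, mask_50)
```

Because every mask is drawn from the same score ranking, a smaller subnetwork is always contained in a larger one, and a new capacity only requires a new threshold rather than additional training.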

Paper Structure

This paper contains 27 sections, 16 equations, 11 figures, 14 tables, 1 algorithm.

Figures (11)

  • Figure 1: Two pipelines for learning compatible models with different capacities for multi-platform deployment. (a) Existing methods tailor $N$ pre-defined (sub)networks for pre-determined platforms through compatible learning and train additional models for new platforms by Backward-Compatible Learning (BCL). (b) Our method constructs a prunable network that can generate compatible subnetworks at any specified capacity via pruning.
  • Figure 2: Comparison between one-shot and iterative pruning with edge-popup (Ramanujan et al., "What's Hidden in a Randomly Weighted Neural Network?"). The plots show the mean results over 5 random initializations. Shaded areas denote the standard deviation.
  • Figure 3: Overall pipeline for constructing and optimizing a Prunable Network (PrunNet). Each connection in PrunNet is characterized by a weight $w^l_{ij}$ and a score $s^{l}_{ij}$. The subnetworks are generated by greedy pruning according to the scores. After calculating the gradients of the losses $\{\mathcal{L}_0, \mathcal{L}_1, \ldots, \mathcal{L}_N\}$, we use conflict-aware gradient integration to obtain the gradient $\tilde{\bm g}$ used to update the parameters of PrunNet.
  • Figure 4: (a) The performance of our method at new capacities. (b) The number of conflicting gradient pairs in the first convolutional layer of PrunNet. ResNet-18 is used as the backbone.
  • Figure 5: The gradient magnitudes of a convolutional kernel in SwitchNet and PrunNet when optimizing them with our losses. The gradient magnitudes of PrunNet remain consistent across different losses as training progresses.
  • ...and 6 more figures
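
The notion of conflicting gradient pairs mentioned in the captions of Figures 3 and 4 can be made concrete with a small sketch. The paper's own integration scheme is not specified in this excerpt, so the code below substitutes a PCGrad-style projection as a stand-in, with a fixed rather than shuffled projection order. It assumes that two gradients conflict when their dot product is negative; the names `count_conflicts` and `project_conflicts` and the example gradients are hypothetical.

```python
from itertools import combinations


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def count_conflicts(grads):
    """Count gradient pairs with a negative dot product (assumed conflict criterion)."""
    return sum(1 for g, h in combinations(grads, 2) if dot(g, h) < 0)


def project_conflicts(grads):
    """PCGrad-style stand-in, NOT the paper's scheme.

    For each gradient, remove its component along every other gradient it
    conflicts with, then sum the adjusted gradients into one update direction.
    """
    projected = []
    for i, g in enumerate(grads):
        g = list(g)
        for j, h in enumerate(grads):
            if i == j:
                continue
            d = dot(g, h)
            if d < 0:  # h is nonzero here, since d < 0 implies dot(h, h) > 0
                scale = d / dot(h, h)
                g = [a - scale * b for a, b in zip(g, h)]
        projected.append(g)
    return [sum(col) for col in zip(*projected)]


# Hypothetical gradients of three losses with respect to two shared parameters.
grads = [[1.0, 0.0], [-1.0, 1.0], [0.0, 1.0]]
n_conflicts = count_conflicts(grads)
merged = project_conflicts(grads)
```

In this example only the first two gradients conflict, and the merged direction has a non-negative dot product with each of the three original gradients, which is the property a conflict-aware integration aims for.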