Budget-Aware Pruning: Handling Multiple Domains with Less Parameters

Samuel Felipe dos Santos, Rodrigo Berriel, Thiago Oliveira-Santos, Nicu Sebe, Jurandy Almeida

TL;DR

The paper addresses multi-domain learning under user-defined resource budgets by extending Budget-Aware Adapters to promote cross-domain filter sharing, so that weights no domain uses can be pruned at test time. It introduces three parameter-sharing losses ($L_{PS}^{Int.}$, $L_{PS}^{Union}$, and $L_{PS}^{Jaccard}$) and a learnable weight for these losses, along with a new efficiency metric that balances accuracy against resource use. The method trains all domains simultaneously and prunes the weights not used by any domain, yielding fewer parameters and fewer multiply-accumulate operations (MACs) than the backbone while maintaining competitive performance on the Visual Decathlon and ImageNet-to-Sketch benchmarks. The approach offers a practical path to deploying multi-domain models on resource-constrained devices by jointly optimizing accuracy and efficiency through structured parameter sharing and test-time pruning.
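The loss names above suggest set-style overlap measures computed on the per-domain filter masks. This summary does not give the loss formulas, so the following is only a minimal sketch of the underlying set quantities (intersection size, union size, and Jaccard index) on hypothetical binary masks. The function name `mask_overlap` and the example masks are assumptions made for illustration and are not taken from the paper.

```python
from typing import Dict, Sequence


def mask_overlap(masks: Sequence[Sequence[int]]) -> Dict[str, float]:
    """Overlap statistics for per-domain binary filter masks (1 = filter used).

    Illustrative only: these are plain set measures, not the paper's losses.
    """
    if not masks:
        raise ValueError("need at least one mask")
    n = len(masks[0])
    if any(len(m) != n for m in masks):
        raise ValueError("all masks must have the same length")

    # A filter is in the intersection if every domain uses it,
    # and in the union if at least one domain uses it.
    inter = sum(1 for j in range(n) if all(m[j] for m in masks))
    union = sum(1 for j in range(n) if any(m[j] for m in masks))

    # Jaccard index |A ∩ B| / |A ∪ B|, generalized to several masks.
    # By convention here, masks that select nothing count as identical.
    jaccard = inter / union if union else 1.0
    return {"intersection": inter, "union": union, "jaccard": jaccard}


# Three hypothetical domains sharing a layer with six filters.
masks = [
    [1, 1, 0, 1, 0, 0],
    [1, 1, 0, 0, 1, 0],
    [1, 1, 0, 1, 0, 0],
]
stats = mask_overlap(masks)
print(stats)
```

In this toy case two filters are used by every domain and four by at least one, so the Jaccard index is 0.5. A smaller union means more filters are unused by all domains and can therefore be pruned, while a higher Jaccard index means the domains agree more closely on which filters to keep.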

Abstract

Deep learning has achieved state-of-the-art performance on several computer vision tasks and domains. Nevertheless, it still has a high computational cost and demands a significant amount of parameters. Such requirements hinder the use in resource-limited environments and demand both software and hardware optimization. Another limitation is that deep models are usually specialized into a single domain or task, requiring them to learn and store new parameters for each new one. Multi-Domain Learning (MDL) attempts to solve this problem by learning a single model capable of performing well in multiple domains. Nevertheless, the models are usually larger than the baseline for a single domain. This work tackles both of these problems: our objective is to prune models capable of handling multiple domains according to a user-defined budget, making them more computationally affordable while keeping a similar classification performance. We achieve this by encouraging all domains to use a similar subset of filters from the baseline model, up to the amount defined by the user's budget. Then, filters that are not used by any domain are pruned from the network. The proposed approach innovates by better adapting to resource-limited devices while being one of the few works that handles multiple domains at test time with fewer parameters and lower computational complexity than the baseline model for a single domain.

Paper Structure (14 sections, 12 equations, 2 figures, 3 tables)


Figures (2)

  • Figure 1: In MDL, a pre-trained model is adapted to handle multiple domains. In standard adapters, the number of parameters in the domain-specific models (indicated in colored $\mathcal{C}$) is equal to or greater than that of the backbone model (due to the mask represented in black). Budget-Aware Adapters (Berriel et al., 2019) reduce the number of parameters required for each domain (unused parameters are denoted in gray). However, the whole model is still needed at test time when handling distinct domains (colored areas share few parameters). Our model encourages different domains to use the same parameters (colored areas share most of the parameters). Thus, when handling multiple domains at test time, the unused parameters can be pruned.
  • Figure 2: Overview of our strategy for sharing parameters among domains. Our parameter-sharing loss function is calculated over a combination of the masks from all the domains and is used to encourage the sharing of parameters between them. The parameters that are not used by any domain (white squares) can be pruned, reducing the number of parameters and the computational cost of the model. $\odot$ represents the element-wise multiplication between the binary mask of a domain and the kernel weights, and $\oplus$ represents the union of the weights used by each domain. Colors represent data (e.g., weights and masks); the colored squares therefore denote both the input data for each operation and its resulting output.
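
The two operations in the Figure 2 caption can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the authors' implementation: each filter is represented as a flat list of weights, masks hold one 0/1 entry per filter, and the helper names `apply_mask`, `union_mask`, and `prune_unused` are hypothetical.

```python
from typing import List, Sequence, Tuple

Filter = List[float]


def apply_mask(filters: Sequence[Filter], mask: Sequence[int]) -> List[Filter]:
    """Element-wise product of a domain's binary mask with the kernel weights.

    Filters whose mask entry is 0 are zeroed out for that domain.
    """
    return [[w * m for w in f] for f, m in zip(filters, mask)]


def union_mask(masks: Sequence[Sequence[int]]) -> List[int]:
    """Union over domains: a filter counts as used if any domain uses it."""
    return [int(any(column)) for column in zip(*masks)]


def prune_unused(
    filters: Sequence[Filter], masks: Sequence[Sequence[int]]
) -> Tuple[List[Filter], List[List[int]], List[int]]:
    """Drop filters that no domain uses and shrink every mask to match."""
    keep = [j for j, used in enumerate(union_mask(masks)) if used]
    pruned_filters = [list(filters[j]) for j in keep]
    pruned_masks = [[m[j] for j in keep] for m in masks]
    return pruned_filters, pruned_masks, keep


# Hypothetical layer with four filters of two weights each, and two domains.
filters = [[0.5, -1.0], [2.0, 0.1], [0.3, 0.3], [-0.7, 1.2]]
masks = [
    [1, 0, 1, 0],  # domain A uses filters 0 and 2
    [1, 0, 0, 0],  # domain B uses filter 0 only
]

masked_b = apply_mask(filters, masks[1])
pruned_filters, pruned_masks, keep = prune_unused(filters, masks)
print(keep, pruned_filters, pruned_masks)
```

Filters 1 and 3 are used by neither domain, so only filters 0 and 2 survive, and each domain keeps its own smaller mask over the remaining filters. In a real convolutional network, removing an output filter also removes the matching input channel of the following layer; that bookkeeping is omitted here.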