EERO: Early Exit with Reject Option for Efficient Classification with limited budget

Florian Valade, Mohamed Hebiri, Paul Gay

TL;DR

EERO addresses the challenge of deploying deep networks under strict computational budgets by reframing early exit as a multi-head classification problem with a reject option. It combines an optimal abstention rule with budget-aware aggregation of exit probabilities via exponential weights, calibrating the threshold at each head so that a budget $B$ is respected. The approach comes with oracle-type guarantees and practical budget control. Experiments with ResNet-18 and ConvNext models on CIFAR-100 and ImageNet show competitive or superior accuracy under the computational limit. Because the framework is budget-aware and model-agnostic, it suits edge devices and other resource-constrained settings, where it enables efficient inference without architecture redesign.

Abstract

The increasing complexity of advanced machine learning models requires innovative approaches to manage computational resources effectively. One such method is the Early Exit strategy, which allows for adaptive computation by providing a mechanism to shorten the processing path for simpler data instances. In this paper, we propose EERO, a new methodology that translates the problem of early exiting into a problem of using multiple classifiers with reject option, in order to better select the exiting head for each instance. We calibrate the probabilities of exiting at the different heads using aggregation with exponential weights to guarantee a fixed budget. We consider factors such as Bayesian risk, budget constraints, and head-specific budget consumption. Experimental results, obtained with a ResNet-18 model and a ConvNext architecture on CIFAR and ImageNet datasets, demonstrate that our method not only manages budget allocation effectively but also improves accuracy in overthinking scenarios.
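The link between per-head exit probabilities and the budget can be illustrated with a small sketch. It is not the paper's calibration procedure: the function names, the convention that `exit_probs[l]` is the probability of exiting at head `l` given that head `l` was reached, and the cumulative cost values are all assumptions made for illustration.

```python
import numpy as np

def exit_distribution(exit_probs):
    """Probability of leaving the network at each head.

    exit_probs[l] is the (assumed) probability of exiting at head l given
    that head l was reached; the last head must always exit (value 1.0).
    """
    p = np.asarray(exit_probs, dtype=float)
    if p[-1] != 1.0:
        raise ValueError("the final head must exit with probability 1")
    # Probability of reaching head l is the product of (1 - p[k]) for k < l.
    reach = np.concatenate(([1.0], np.cumprod(1.0 - p[:-1])))
    return reach * p

def expected_cost(exit_probs, cumulative_costs):
    """Average compute per instance, given the cost paid up to each head."""
    return float(np.dot(exit_distribution(exit_probs), cumulative_costs))

def within_budget(exit_probs, cumulative_costs, budget):
    """Check whether the expected cost respects the budget."""
    return expected_cost(exit_probs, cumulative_costs) <= budget

# Hypothetical three-head network: costs are cumulative (arbitrary units).
probs = [0.5, 0.5, 1.0]
costs = [1.0, 2.0, 4.0]
dist = exit_distribution(probs)
cost = expected_cost(probs, costs)
print(dist, cost)
```

With these illustrative numbers, half of the instances leave at the first head, and the expected cost sits well below the cost of always running the full network, which is the quantity a budget constraint would bound.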

Paper Structure

This paper contains 21 sections, 6 theorems, 48 equations, 8 figures, 3 tables, 1 algorithm.

Key Result

Proposition 3.2

Assume the cumulative distribution function (CDF) $F_{s}$ of $s(\mathbf{X})$ is continuous. Then, for all $\mathbf{x} \in \mathcal{X}$, [...]. In particular, we have $\mathbb{P}(h_{\varepsilon}^*(\mathbf{X}) = \mathfrak{R}) = 1-\varepsilon$.
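The rejection-rate statement can be checked numerically with a minimal sketch. It assumes, for illustration only, that the rule abstains when the score $s(\mathbf{x})$ falls below the $(1-\varepsilon)$-quantile of its distribution; the function names and the synthetic Beta-distributed scores are hypothetical and not taken from the paper.

```python
import numpy as np

def calibrate_threshold(scores, eps):
    """Empirical (1 - eps)-quantile of the calibration scores."""
    return float(np.quantile(scores, 1.0 - eps))

def reject_mask(scores, threshold):
    """True where the classifier abstains (score below the threshold)."""
    return np.asarray(scores) < threshold

rng = np.random.default_rng(0)
# Synthetic confidence scores with a continuous distribution (assumption).
calib_scores = rng.beta(5.0, 2.0, size=20000)
test_scores = rng.beta(5.0, 2.0, size=20000)

eps = 0.3
threshold = calibrate_threshold(calib_scores, eps)
reject_rate = float(reject_mask(test_scores, threshold).mean())
print(f"target reject rate: {1 - eps:.2f}, observed: {reject_rate:.3f}")
```

Continuity of the CDF matters here: with ties in the scores, a mass of instances could sit exactly at the threshold and the observed rejection rate could deviate from $1-\varepsilon$.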

Figures (8)

  • Figure 1: Illustration of the Early Exit principle in a convolutional architecture.
  • Figure 2: Accuracy w.r.t. the budget for our EERO methodology based on ConvNext against other known methods such as Patience [zhou2020bert, zhang-etal-2022-pcee], Geometric distribution [huang2017multi, elbayad2020depthadaptivetransformer], or Gaussian [li2022predictiveexitpredictionfinegrained] distribution of weights on each head. Main points correspond to the accuracy of each head alone.
  • Figure 3: Measured and allowed budget of our algorithm on ConvNext. This figure shows that our method tracks the given budget closely and never exceeds it.
  • Figure 4: Value of the aggregation weights $\hat{\varepsilon}^{\ell}$ on an intermediate budget for the ConvNext model for EERO. Each bar represents an exit head $\ell$.
  • Figure 5: Subfigures (a) and (b) compare EERO with MSDNet's original method in terms of accuracy and budget. Error bars were computed by bootstrapping the data.
  • ...and 3 more figures

Theorems & Definitions (19)

  • Definition 3.1
  • Proposition 3.2
  • Definition 3.3
  • Remark 3.4
  • Proposition 3.5
  • Remark 3.6
  • Remark 3.7
  • Proposition 3.8
  • Theorem 3.9
  • Remark 3.10
  • ...and 9 more