PERTINENCE: Input-based Opportunistic Neural Network Dynamic Execution

Omkar Shende, Gayathri Ananthanarayanan, Marcello Traiola

TL;DR

PERTINENCE addresses the challenge of deploying accurate yet efficient DNNs by enabling an online input-based dispatcher that selects among a pre-trained set of models. It treats dispatch as a multi-objective optimization problem, using NSGA-II-like evolution to explore penalties and weightings, thereby building a Pareto front of models with associated dispatch rules. The method is validated on CNNs and ViTs across CIFAR-10, CIFAR-100, and Tiny ImageNet, and demonstrated in a real-time road-occupancy case study, achieving up to 36% fewer operations without sacrificing accuracy and often improving it. The work shows that dynamic, input-aware model combination can significantly reduce computational cost and energy consumption in practical deployments, with broad applicability to resource-constrained edge devices and streaming video tasks.

Abstract

Deep neural networks (DNNs) have become ubiquitous thanks to their remarkable ability to model complex patterns across various domains such as computer vision, speech recognition, and robotics. While large DNN models are often more accurate than simpler, lightweight models, they are also resource- and energy-hungry. Hence, it is imperative to design methods to reduce reliance on such large models without significant degradation in output accuracy. The high computational cost of these models is often necessary only for a reduced set of challenging inputs, while lighter models can handle most simple ones. Thus, carefully combining properties of existing DNN models in a dynamic, input-based way opens opportunities to improve efficiency without impacting accuracy. In this work, we introduce PERTINENCE, a novel online method designed to analyze the complexity of input features and dynamically select the most suitable model from a pre-trained set to process a given input effectively. To achieve this, we employ a genetic algorithm to explore the training space of an ML-based input dispatcher, enabling convergence towards the Pareto front in the solution space that balances overall accuracy and computational efficiency. We showcase our approach on state-of-the-art Convolutional Neural Networks (CNNs) trained on the CIFAR-10 and CIFAR-100 datasets, as well as Vision Transformers (ViTs) trained on the Tiny ImageNet dataset. We report results showing PERTINENCE's ability to provide alternative solutions to existing state-of-the-art models in terms of trade-offs between accuracy and number of operations. By opportunistically selecting among models trained for the same task, PERTINENCE achieves better or comparable accuracy with up to 36% fewer operations.
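
The Pareto front mentioned in the abstract balances two objectives: higher accuracy and fewer operations. As a minimal sketch of that idea only, the snippet below filters a list of (accuracy, operations) pairs down to the non-dominated ones. The names `dominates` and `pareto_front` are hypothetical and the numbers used with them are made up for illustration; this is not the paper's genetic algorithm, just the dominance test that a Pareto front rests on.

```python
from typing import List, Tuple

# A candidate solution is (accuracy, operations). Higher accuracy is better,
# fewer operations is better. All values used with this are illustrative.
Candidate = Tuple[float, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a is no worse than b on both objectives and strictly better on one."""
    acc_a, ops_a = a
    acc_b, ops_b = b
    no_worse = acc_a >= acc_b and ops_a <= ops_b
    strictly_better = acc_a > acc_b or ops_a < ops_b
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only candidates that no other candidate dominates (input order preserved)."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]
```

A solution stays on the front only if no other solution is both at least as accurate and at least as cheap, which is why a dispatcher configuration can be worth keeping even when it is not the most accurate one.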

Paper Structure

This paper contains 15 sections, 2 equations, 19 figures, 4 tables.

Figures (19)

  • Figure 1: MFLOPs-accuracy trade-off for three different CNN models trained on the CIFAR-10 dataset, compared with an ideal Input Dispatcher that opportunistically selects among the three CNNs depending on the input.
  • Figure 2: PERTINENCE Approach
  • Figure 3: Proposed Input Dispatcher Design
  • Figure 4: Pareto front of DNN models trained on (a) CIFAR-10 (CNNs), (b) CIFAR-100 (CNNs), and (c) Tiny ImageNet (ViTs) datasets.
  • Figure 5: Flow of the proposed PERTINENCE approach
  • ...and 14 more figures
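
The ideal Input Dispatcher in the Figure 1 caption can be read as an oracle: for each input, it uses the cheapest model that classifies that input correctly. The sketch below computes the accuracy and mean cost such an oracle would reach. It assumes per-input correctness of every model is known in advance, which only holds in an offline analysis, and it assumes the oracle falls back to the cheapest model when no model is correct. The function name `ideal_dispatch` and the example costs are hypothetical; the paper's actual dispatcher is a learned, ML-based component and is not reproduced here.

```python
from typing import List, Sequence, Tuple


def ideal_dispatch(
    correct: Sequence[Sequence[bool]],
    costs: Sequence[float],
) -> Tuple[float, float, List[int]]:
    """Oracle dispatch over a set of models.

    correct[i][m] is True if model m classifies input i correctly.
    costs[m] is the per-inference cost of model m (for example, MFLOPs).
    Returns (accuracy, mean cost per input, chosen model index per input).
    """
    # Model indices sorted from cheapest to most expensive.
    order = sorted(range(len(costs)), key=lambda m: costs[m])
    choices: List[int] = []
    n_correct = 0
    total_cost = 0.0
    for row in correct:
        # Cheapest correct model; assumed fallback is the cheapest model overall.
        chosen = next((m for m in order if row[m]), order[0])
        choices.append(chosen)
        n_correct += int(row[chosen])
        total_cost += costs[chosen]
    n = len(correct)
    return n_correct / n, total_cost / n, choices
```

Under these assumptions the oracle is at least as accurate as the best single model, because it is correct whenever any model is correct, while paying the largest model's cost only on inputs that need it.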