InstantNet: Automated Generation and Deployment of Instantaneously Switchable-Precision Networks

Yonggan Fu, Zhongzhi Yu, Yongan Zhang, Yifan Jiang, Chaojian Li, Yongyuan Liang, Mingchao Jiang, Zhangyang Wang, Yingyan Celine Lin

TL;DR

InstantNet targets the rapid development and deployment of DNNs that can instantaneously trade accuracy for efficiency on IoT hardware. It combines three components into an end-to-end pipeline: Cascade Distillation Training (CDT), which preserves accuracy across all supported bit-widths; Switchable-Precision NAS (SP-NAS), which searches for networks that perform well at every bit-width; and an evolutionary AutoMapper, which automatically searches for dataflows on the target device. Empirical results show that CDT improves low-bit accuracy, SP-NAS achieves strong accuracy at low bit-widths with reduced FLOPs, and AutoMapper lowers the energy-delay product (EDP) relative to expert-crafted dataflows, yielding up to 84.68% EDP improvement on CIFAR-100 and a 1.86x FPS gain on ImageNet. Together, these components automate the development and deployment of switchable-precision networks (SP-Nets) across diverse IoT hardware.

Abstract

The promise of Deep Neural Network (DNN)-powered Internet of Things (IoT) devices has motivated a tremendous demand for automated solutions that enable fast development and deployment of efficient (1) DNNs equipped with instantaneous accuracy-efficiency trade-off capability to accommodate the time-varying resources of IoT devices and (2) dataflows to optimize DNNs' execution efficiency on different devices. Therefore, we propose InstantNet to automatically generate and deploy instantaneously switchable-precision networks that operate at variable bit-widths. Extensive experiments show that the proposed InstantNet consistently outperforms state-of-the-art designs.
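To make "operating at variable bit-widths" concrete, the following is a minimal sketch of one set of shared weights quantized at several precisions. It is an illustration only: the symmetric uniform quantizer, the function names `quantize` and `max_error`, and the example weights are assumptions made here, not the quantization scheme or code used by InstantNet.

```python
def quantize(weights, bits):
    """Symmetric uniform quantization of a list of floats to `bits` bits.

    Assumed scheme for illustration; not taken from the InstantNet paper.
    """
    levels = 2 ** (bits - 1) - 1                   # e.g. 7 positive levels for 4-bit signed
    max_abs = max(abs(w) for w in weights) or 1.0  # avoid a zero scale for all-zero weights
    scale = max_abs / levels
    return [round(w / scale) * scale for w in weights]


def max_error(weights, bits):
    """Largest absolute difference between the original and quantized weights."""
    quantized = quantize(weights, bits)
    return max(abs(w - q) for w, q in zip(weights, quantized))


# One shared set of full-precision weights, "switched" between bit-widths at run time.
shared_weights = [0.82, -0.31, 0.05, -0.77, 0.46]
bit_widths = [4, 8, 16]
errors = {bits: max_error(shared_weights, bits) for bits in bit_widths}

for bits in bit_widths:
    print(f"{bits:>2}-bit max quantization error: {errors[bits]:.6f}")
```

Lower bit-widths produce larger quantization error in exchange for cheaper arithmetic, which is the accuracy-efficiency trade-off a switchable-precision network exposes at run time.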

Paper Structure

This paper contains 16 sections, 2 equations, 8 figures, 4 tables, 1 algorithm.

Figures (8)

  • Figure 1: Overview of InstantNet, which first generates SP-Nets with high accuracy under all bit-widths, and then suggests dataflows to maximize the generated SP-Nets' execution efficiency under different bit-widths on the target device.
  • Figure 2: Visualizing the prediction distribution of MobileNetV2 on CIFAR-100 under (left) 4-bit training with vanilla distillation, (middle) 4-bit training with the proposed CDT, and (right) 32-bit training.
  • Figure 3: Overview of the goal, generic dataflow space, and InstantNet's AutoMapper, where TBS denotes "to be searched".
  • Figure 4: InstantNet's SP-NAS over Full-Precision-NAS (FP-NAS) and Low-Precision-NAS (LP-NAS) on CIFAR-100 under large, middle, and small FLOPs constraints trained for two bit-width sets: (a) [4, 8, 12, 16, 32], and (b) [4, 5, 6, 8].
  • Figure 5: AutoMapper over SOTA expert-crafted and tool-generated dataflows on FPGA/ASIC.
  • ...and 3 more figures