Fluid Dynamic DNNs for Reliable and Adaptive Distributed Inference on Edge Devices

Lei Xun, Mingyu Hu, Hengrui Zhao, Amit Kumar Singh, Jonathon Hare, Geoff V. Merrett

TL;DR

Fluid Dynamic DNNs (Fluid DyDNNs) address reliability and adaptability gaps in distributed edge inference by decoupling sub-networks through a nested incremental training algorithm. The resulting sub-networks can run independently or jointly, which enables High-Throughput (HT) and High-Accuracy (HA) modes and tolerance of single-device failures. On MNIST with a small CNN running on embedded Arm CPUs, Fluid DyDNNs continue inference after a single device failure, whereas Static and Dynamic DNNs fail. With both devices online, HT mode achieves $2.5\times$ and $2\times$ the throughput of Static and Dynamic DNNs, respectively, and HA mode achieves accuracy comparable to Static DNNs. The results suggest a practical strategy for robust distributed inference on embedded CPUs.

Abstract

Distributed inference is a popular approach for efficient DNN inference at the edge. However, traditional Static and Dynamic DNNs are not distribution-friendly, causing system reliability and adaptability issues. In this paper, we introduce Fluid Dynamic DNNs (Fluid DyDNNs), tailored for distributed inference. Distinct from Static and Dynamic DNNs, Fluid DyDNNs utilize a novel nested incremental training algorithm to enable independent and combined operation of their sub-networks, enhancing system reliability and adaptability. Evaluation on embedded Arm CPUs with a DNN model and the MNIST dataset shows that, in scenarios of single device failure, Fluid DyDNNs ensure continued inference, whereas Static and Dynamic DNNs fail. When both devices are fully operational, Fluid DyDNNs can operate either in a High-Accuracy mode, achieving accuracy comparable to Static DNNs, or in a High-Throughput mode, achieving 2.5x and 2x the throughput of Static and Dynamic DNNs, respectively.
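
The operating modes described in the abstract can be illustrated with a minimal sketch of how a runtime might choose between them. This is not the authors' implementation: the names `Mode` and `plan_inference` and the returned plan labels are hypothetical, and the sketch assumes a two-device (Master and Worker) setup in which each device holds one sub-network that can run alone or be combined with the other.

```python
from enum import Enum
from typing import Optional


class Mode(Enum):
    """Preferred operating mode when both devices are online."""
    HIGH_ACCURACY = "HA"
    HIGH_THROUGHPUT = "HT"


def plan_inference(master_online: bool, worker_online: bool,
                   preferred: Mode) -> Optional[str]:
    """Return a hypothetical plan label for the current device state.

    Assumption: each device holds one sub-network that can run on its
    own, so a single surviving device can still serve requests.
    """
    if master_online and worker_online:
        if preferred is Mode.HIGH_ACCURACY:
            # Both sub-networks are used together as one model.
            return "combined"
        # Both sub-networks run independently of each other.
        return "independent"
    if master_online:
        # Worker failed: the Master's sub-network runs alone.
        return "master_only"
    if worker_online:
        # Master failed: the Worker's sub-network runs alone.
        return "worker_only"
    # No device is available, so no inference is possible.
    return None
```

Under these assumptions, the two single-device branches are what distinguish a Fluid DyDNN: the abstract reports that Static and Dynamic DNNs fail under single device failure, so they could not return a usable plan in those cases.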

Paper Structure

This paper contains 7 sections, 2 figures, and 1 algorithm.

Figures (2)

  • Figure 1: (a) Overview of the three types of DNN model, and their behaviour in distributed DNN inference under (b) worker device failure and (c) master device failure.
  • Figure 2: Throughput and accuracy of Static DNNs, Dynamic DNNs [xun2019incremental], and Fluid DyDNNs in High-Accuracy (HA) and High-Throughput (HT) modes. Inference results were collected when only the Master is online, when only the Worker is online, and when both devices are online.