NaviSplit: Dynamic Multi-Branch Split DNNs for Efficient Distributed Autonomous Navigation

Timothy K Johnsen, Ian Harshbarger, Zixia Xia, Marco Levorato

TL;DR

This work tackles the challenge of running perception-heavy navigation on resource-constrained UAVs by proposing NaviSplit, a dynamic multi-branch split DNN framework with supervised compression. The perception DNN is split at a compression point: a head model on the vehicle compacts the camera input, and a tail model on an edge server completes depth extraction and infers navigation commands. A gating auxiliary model selects among several encoder/decoder branches to balance channel usage against accuracy. Empirical results in the AirSim simulator show depth extraction accuracy of 72–81% with 1.2–18 KB transmitted, and navigation accuracy of 82.5% with a roughly 95% reduction in data rate compared to a large static model. The gate tends to select lighter branches on easier routes and heavier ones in more complex environments. To the authors' knowledge, NaviSplit is the first dynamic multi-branch split DNN for autonomous navigation, and open-source tooling for AirSim is released to promote adoption.

Abstract

Lightweight autonomous unmanned aerial vehicles (UAVs) are emerging as a central component of a broad range of applications. However, autonomous navigation requires perception algorithms, often deep neural networks (DNNs), that process sensor observations, such as those from cameras and LiDARs, for control logic. The complexity of such algorithms clashes with the severe constraints of these devices in terms of computing power, energy, memory, and execution time. In this paper, we propose NaviSplit, the first instance of a lightweight navigation framework embedding a distributed and dynamic multi-branched neural model. At its core is a DNN split at a compression point, resulting in two model parts: (1) the head model, which is executed at the vehicle and partially processes and compacts perception from sensors; and (2) the tail model, which is executed at an interconnected compute-capable device, processes the remainder of the compacted perception, and infers navigation commands. Different from prior work, the NaviSplit framework includes a neural gate that dynamically selects a specific head model to minimize channel usage while efficiently supporting the navigation network. In our implementation, the perception model extracts a 2D depth map from a monocular RGB image captured by the drone in the Microsoft AirSim simulator. Our results demonstrate that the NaviSplit depth model achieves an extraction accuracy of 72-81% while transmitting an extremely small amount of data (1.2-18 KB) to the edge server. When using the neural gate, NaviSplit obtains a navigation accuracy 0.3% higher than that of a larger static network while reducing the data rate by 95%. To the best of our knowledge, this is the first exemplar of a dynamic multi-branched model based on split DNNs for autonomous navigation.
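
The head/tail split and the gate described in the abstract can be illustrated with a toy sketch. Everything here is an assumption made for illustration, not the authors' implementation: the branch table `BRANCH_POOL`, the functions `head`, `tail`, `gate`, and `step`, and the thresholds are hypothetical. Average pooling stands in for the learned encoder, value repetition for the learned decoder, and a variance heuristic for the learned auxiliary model.

```python
from statistics import pvariance

# Hypothetical branch table: branch index -> pooling factor.
# A larger factor means fewer values are sent over the channel.
BRANCH_POOL = {0: 8, 1: 4, 2: 2}


def head(frame, branch):
    """On-vehicle part: compress the frame by average pooling.

    Stand-in for a learned encoder; len(frame) must be divisible
    by the branch's pooling factor.
    """
    k = BRANCH_POOL[branch]
    return [sum(frame[i:i + k]) / k for i in range(0, len(frame), k)]


def tail(code, branch):
    """Edge part: expand the compact code back to full length.

    Stand-in for a learned decoder followed by the navigation model.
    """
    k = BRANCH_POOL[branch]
    return [value for value in code for _ in range(k)]


def gate(frame, thresholds=(0.01, 0.05)):
    """Choose a branch from a crude scene-complexity score.

    Variance is an assumed proxy; the paper uses a learned
    auxiliary model for this decision.
    """
    score = pvariance(frame)
    if score < thresholds[0]:
        return 0  # simple scene: lightest branch
    if score < thresholds[1]:
        return 1
    return 2      # complex scene: heaviest branch


def step(frame):
    """One time step: gate, compress on the vehicle, reconstruct at the edge."""
    branch = gate(frame)
    code = head(frame, branch)    # this is what crosses the wireless link
    recon = tail(code, branch)
    return branch, len(code), recon


flat_frame = [0.5] * 16           # uniform input, low complexity
busy_frame = [0.0, 1.0] * 8       # alternating input, high complexity
print(step(flat_frame)[:2])
print(step(busy_frame)[:2])
```

In this sketch the uniform frame is routed to the lightest branch and sends the fewest values, while the alternating frame is routed to the heaviest branch, mirroring the reported tendency of the gate to use lighter branches in easier conditions.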

Paper Structure

This paper contains 10 sections, 6 equations, 3 figures, 1 table.

Figures (3)

  • Figure 1: The proposed framework uses a teacher model that maps a monocular RGB image to a depth map. Several split points, each with an encoder/decoder, are injected into the teacher model to create multiple student model branches capable of split computing. An auxiliary model selects among these branches given the perceived context. Extracted depth maps are input to the navigation model, which outputs the motion actions used during drone navigation.
  • Figure 2: Compressed data size for different models versus the resulting depth extraction error. Markers from left to right: for bottleneck models, channel reductions of [12, 24, 64]; for baseline models, channel reductions of [2, 4, 8, 16, 32]; and for JPEG models, compression quality from 5 to 95.
  • Figure 3: Mean value of the gate control, $\hat{c}$, over all successful paths when an auxiliary model adapts and selects $\hat{c}$ at each time step. This is a demonstration of NaviSplit, which was trained and evaluated in Microsoft AirSim.