Circuits and Systems for Embodied AI: Exploring uJ Multi-Modal Perception for Nano-UAVs on the Kraken Shield

Viviane Potocnik, Alfio Di Mauro, Lorenzo Lamberti, Victor Kartsch, Moritz Scherer, Francesco Conti, Luca Benini

TL;DR

This paper explores embodied multi-modal AI-based perception for Nano-UAVs with the Kraken shield, a 7 g multi-sensor board (frame-based and event-based imagers) built around Kraken, a 22 nm SoC whose acceleration engines run event-based inference with spiking neural networks (SNNs) and frame-based inference with ternary neural networks (TNNs).

Abstract

Embodied artificial intelligence (AI) requires pushing complex multi-modal models to the extreme edge for time-constrained tasks such as autonomous navigation of robots and vehicles. On small form-factor devices, e.g., nano-sized unmanned aerial vehicles (UAVs), such challenges are exacerbated by stringent constraints on energy efficiency and weight. In this paper, we explore embodied multi-modal AI-based perception for Nano-UAVs with the Kraken shield, a 7 g multi-sensor (frame-based and event-based imagers) board based on Kraken, a 22 nm SoC featuring multiple acceleration engines for multi-modal event and frame-based inference based on spiking (SNN) and ternary (TNN) neural networks, respectively. Kraken can execute SNN real-time inference for depth estimation at 1.02k inf/s, 18 μJ/inf, TNN real-time inference for object classification at 10k inf/s, 6 μJ/inf, and real-time inference for obstacle avoidance at 221 frames/s, 750 μJ/inf.
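
To put the throughput and energy figures above on a common scale, the following minimal sketch multiplies each quoted inference rate by its energy per inference to get an implied average power. It assumes that each rate/energy pair refers to the same operating point and that inferences run back to back; the abstract does not state this, so the results are rough estimates rather than measured power figures. The function and dictionary names are illustrative only.

```python
# Implied average power = inference rate (1/s) x energy per inference (J).
# Assumption: each rate/energy pair from the abstract describes the same
# operating point with back-to-back inferences (not stated in the source).

WORKLOADS = {
    # name: (inferences per second, joules per inference)
    "SNN depth estimation": (1020.0, 18e-6),
    "TNN object classification": (10_000.0, 6e-6),
    "Obstacle avoidance": (221.0, 750e-6),
}


def average_power_mw(rate_per_s: float, energy_j: float) -> float:
    """Return the implied average power in milliwatts."""
    return rate_per_s * energy_j * 1e3


if __name__ == "__main__":
    for name, (rate, energy) in WORKLOADS.items():
        print(f"{name}: {average_power_mw(rate, energy):.2f} mW")
```

Under these assumptions the three workloads come out at roughly 18 mW, 60 mW, and 166 mW, respectively.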

Paper Structure

This paper contains 10 sections and 5 figures.

Figures (5)

  • Figure 1: Architectural block diagram of the Kraken System on Chip (SoC). The diagram shows the FC, the cluster, and the accelerator domain, which hosts two accelerators, SNE and CUTIE, for ML-based perception. The peripherals are centered around the FC, and the data paths detailed in Section 4 are highlighted in a lilac shade.
  • Figure 2: Sparse Neural Engine (SNE) Architecture and Timing
  • Figure 3: Completely Unrolled Ternary Inference Engine (CUTIE) Architecture and Timing
  • Figure 4: Complete system overview
  • Figure 5: Kraken power waveform executing Tiny-PULP-Dronet with FC at 280 MHz, CL at 300 MHz, and Vdd at 0.8 V.