A Heterogeneous RISC-V based SoC for Secure Nano-UAV Navigation

Luca Valente, Alessandro Nadalini, Asif Veeran, Mattia Sinigaglia, Bruno Sa, Nils Wistoff, Yvan Tortorella, Simone Benatti, Rafail Psiakis, Ari Kulmala, Baker Mohammad, Sandro Pinto, Daniele Palossi, Luca Benini, Davide Rossi

TL;DR

Nano-UAVs require real-time ML and mature software stacks within very tight power and form-factor limits. Shaheen delivers a $9\,\mathrm{mm^2}$, $200\,\mathrm{mW}$ heterogeneous SoC in $22\,\mathrm{nm}$ FD-SOI that couples a Linux-capable RV64 host with Hypervisor extension and timing-channel protection to a programmable RV32 cluster, plus up to $512\,\mathrm{MB}$ of HyperRAM. It introduces hardware virtualization (Hypervisor), timing-channel mitigation (fence.t), and a scalable 8-core Flex-V cluster with mixed-precision SIMD, achieving up to $90\,\mathrm{GOp/s}$ and up to $1.8\,\mathrm{TOp/s/W}$ on 2-bit integer kernels, and up to $7.9\,\mathrm{GFLOp/s}$ and up to $150\,\mathrm{GFLOp/s/W}$ on FP workloads. The work demonstrates secure, multi-domain operation with a full OS and a programmable accelerator at nano-UAV power budgets, enabling practical onboard autonomy and ROS-enabled software stacks, with significant implications for next-generation AI-IoT UAV systems.

Abstract

The rapid advancement of energy-efficient parallel ultra-low-power (ULP) microcontroller units (MCUs) is enabling the development of autonomous nano-sized unmanned aerial vehicles (nano-UAVs). These sub-10 cm drones represent the next generation of unobtrusive robotic helpers and ubiquitous smart sensors. However, nano-UAVs face significant power and payload constraints while requiring advanced computing capabilities akin to those of standard drones, including real-time Machine Learning (ML) performance and the safe co-existence of general-purpose and real-time OSs. Although some advanced parallel ULP MCUs offer the necessary ML computing capabilities within the prescribed power limits, they rely on small main memories (<1 MB) and microcontroller-class CPUs with no virtualization or security features, and hence support only simple bare-metal runtimes. In this work, we present Shaheen, a 9 mm², 200 mW SoC implemented in 22 nm FDX technology. Unlike state-of-the-art MCUs, Shaheen integrates a Linux-capable RV64 core, compliant with the v1.0 ratified Hypervisor extension and equipped with timing-channel protection, along with a low-cost, low-power memory controller that exposes up to 512 MB of off-chip HyperRAM directly to the CPU. At the same time, it integrates a fully programmable, energy- and area-efficient multi-core cluster of RV32 cores optimized for general-purpose DSP as well as reduced- and mixed-precision ML. To the best of the authors' knowledge, it is the first silicon prototype of a ULP SoC coupling RV64 and RV32 cores in a heterogeneous host+accelerator architecture fully based on the RISC-V ISA. We demonstrate the capabilities of the proposed SoC on a wide range of benchmarks relevant to nano-UAV applications. The cluster can deliver up to 90 GOp/s and up to 1.8 TOp/s/W on 2-bit integer kernels, and up to 7.9 GFLOp/s and up to 150 GFLOp/s/W on 16-bit FP kernels.
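
As a rough reading aid, the peak efficiency figures quoted above can be restated as energy per operation, since $1\,\mathrm{Op/s/W} = 1\,\mathrm{Op/J}$. The symbols $E_{\mathrm{int2}}$ and $E_{\mathrm{fp16}}$ are introduced here only for illustration and do not appear in the paper; the values are a unit conversion of the "up to" numbers, not additional measurements.

$$
E_{\mathrm{int2}} = \frac{1}{1.8 \times 10^{12}\,\mathrm{Op/J}} \approx 0.56\,\mathrm{pJ/Op},
\qquad
E_{\mathrm{fp16}} = \frac{1}{150 \times 10^{9}\,\mathrm{FLOp/J}} \approx 6.7\,\mathrm{pJ/FLOp}
$$

Under this reading, a 16-bit FP operation costs about $12\times$ the energy of a 2-bit integer operation at the respective best-efficiency points. Because the throughput and efficiency peaks are both stated as "up to" values, they may correspond to different operating points, so dividing $90\,\mathrm{GOp/s}$ by $1.8\,\mathrm{TOp/s/W}$ should not be taken as the cluster power.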

Paper Structure (23 sections, 15 figures, 7 tables)

Figures (15)

  • Figure 1: Shaheen architecture block diagram.
  • Figure 2: RISC-V privilege levels.
  • Figure 3: Channel matrices on the CHANNEL BENCH test.
  • Figure 4: HyperRAM memory controller architecture.
  • Figure 5: Instruction decoding during the status-based execution.
  • ...and 10 more figures