Low Latency Visual Inertial Odometry with On-Sensor Accelerated Optical Flow for Resource-Constrained UAVs

Jonas Kühne, Michele Magno, Luca Benini

TL;DR

The paper addresses low-latency, energy-efficient visual inertial odometry (VIO) on resource-constrained UAVs by moving the optical flow computation onto an on-sensor ASIC (VD56G3). It introduces OF VINS-Mono, a hardware-software co-design that replaces the host-side feature tracker with optical flow data computed on the sensor while keeping the VINS-Mono estimator on a Raspberry Pi Compute Module 4. The authors report a 49.4% reduction in latency and a 53.7% reduction in compute load relative to the original VINS-Mono, which allows operation at up to 50 FPS instead of 20 FPS with competitive tracking accuracy, and they provide a new dataset with ground-truth poses. In practice, this enables real-time VIO on small, power-limited UAVs, and the approach could extend to AR/VR and nano-drones through further hardware scaling.

Abstract

Visual Inertial Odometry (VIO) is the task of estimating the movement trajectory of an agent from an onboard camera stream fused with additional Inertial Measurement Unit (IMU) measurements. A crucial subtask within VIO is the tracking of features, which can be achieved through Optical Flow (OF). As the calculation of OF is a resource-demanding task in terms of computational load and memory footprint, which needs to be executed at low latency, especially in robotic applications, OF estimation is today performed on powerful CPUs or GPUs. This restricts its use in a broad spectrum of applications where the deployment of such powerful, power-hungry processors is unfeasible due to constraints related to cost, size, and power consumption. On-sensor hardware acceleration is a promising approach to enable low latency VIO even on resource-constrained devices such as nano drones. This paper assesses the speed-up in a VIO sensor system exploiting a compact OF sensor consisting of a global shutter camera and an Application Specific Integrated Circuit (ASIC). By replacing the feature tracking logic of the VINS-Mono pipeline with data from this OF camera, we demonstrate a 49.4% reduction in latency and a 53.7% reduction of compute load of the VIO pipeline over the original VINS-Mono implementation, allowing VINS-Mono operation up to 50 FPS instead of 20 FPS on the quad-core ARM Cortex-A72 processor of a Raspberry Pi Compute Module 4.
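To put the reported figures in perspective, the short sketch below works through the per-frame time budget implied by 20 FPS and 50 FPS and applies the reported percentage reductions to a baseline value. The helper names and the 100 ms baseline are illustrative placeholders, not measurements from the paper.

```python
def frame_period_ms(fps: float) -> float:
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


def apply_reduction(baseline: float, reduction_pct: float) -> float:
    """Value that remains after reducing `baseline` by `reduction_pct` percent."""
    return baseline * (1.0 - reduction_pct / 100.0)


# Per-frame budgets for the two frame rates named in the abstract.
budget_original_ms = frame_period_ms(20)  # original VINS-Mono
budget_of_ms = frame_period_ms(50)        # OF VINS-Mono

# Hypothetical baseline latency, used only to illustrate the arithmetic.
hypothetical_baseline_latency_ms = 100.0
reduced_latency_ms = apply_reduction(hypothetical_baseline_latency_ms, 49.4)

# Compute load expressed relative to the original pipeline (1.0 = 100%).
relative_compute_load = apply_reduction(1.0, 53.7)
```

Going from 20 FPS to 50 FPS shrinks the per-frame budget from 50 ms to 20 ms, which is why the pipeline must spend less time per frame before the higher rate becomes sustainable.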

Paper Structure (19 sections, 6 figures, 6 tables)

Figures (6)

  • Figure 1: The hardware system consists of the VD56G3 sensor, an MPU6500 IMU, and a Raspberry Pi Compute Module 4. Markers were placed on the sensor to record ground-truth poses with a VICON motion capture system. The main software components of the OF VINS-Mono pipeline are the IMU pre-integrator module; the feature tracker, which concatenates optical flow vectors into feature tracks; the pose estimation module, which fuses the optical flow sensor data with the IMU data to estimate pose changes; and the pose graph module, which handles loop closures and optimizes the resulting pose graph.
  • Figure 2: Left: Sample trajectory (in blue) of the movement in the 4-meter by 4-meter room captured by the VICON system. The smaller loops in the bottom-left corner of the plot were performed to align the ground-truth recording with the predictions of both VINS-Mono systems. Right: The three different camera orientations are indicated: in the movement direction (orange), perpendicular to the movement direction (green), and at 45 degrees to the movement direction (red).
  • Figure 3: The plots show the translational and rotational errors of randomly sampled sub-trajectories of the indicated lengths for the BRIEF 150 parameter set. The top plots correspond to the camera pointing in the direction of movement, whereas in the bottom plots the camera is oriented perpendicular to the movement direction. Although the rotational drift in the top plot is much smaller, we observed that it is systematic for both the original VINS-Mono and OF VINS-Mono, leading to an accumulation of the error, which is also reflected in the larger translation error compared to the bottom plot.
  • Figure 4: Latency breakdown for the calculation of one odometry estimation for both implementations. The IMU pre-integration is omitted in the diagram for better readability.
  • Figure 5: A comparison of the system power draw of the original VINS-Mono pipeline versus OF VINS-Mono. Three phases are shown: the idle consumption of the Raspberry Pi CM4, the power consumption when only the camera capture is enabled, and the power draw when either VIO pipeline is fully operational. The plot shows both the averaged (opaque lines) and non-averaged (transparent lines) system power draw.
  • ...and 1 more figures
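
The Figure 1 caption notes that the OF VINS-Mono feature tracker concatenates optical flow vectors into feature tracks. The sketch below illustrates one way such chaining could work; it is not the authors' implementation. It assumes, hypothetically, that each flow vector arrives as a pair of integer pixel positions (previous frame, current frame) and that a vector continues a track when its start point exactly matches the track's latest position. The names `FlowVector`, `Track`, and `FlowTrackChainer` are invented for illustration, and the actual VD56G3 output format is not described here.

```python
from dataclasses import dataclass, field

Point = tuple[int, int]  # (x, y) pixel position; integer coordinates are an assumption


@dataclass
class FlowVector:
    start: Point  # feature position in the previous frame
    end: Point    # feature position in the current frame


@dataclass
class Track:
    track_id: int
    points: list[Point] = field(default_factory=list)


class FlowTrackChainer:
    """Chains per-frame optical flow vectors into multi-frame feature tracks."""

    def __init__(self) -> None:
        self._next_id = 0
        # Active tracks, keyed by their most recent pixel position.
        self._active: dict[Point, Track] = {}

    def update(self, vectors: list[FlowVector]) -> list[Track]:
        """Consume the flow vectors of one frame and return the live tracks."""
        still_active: dict[Point, Track] = {}
        for vec in vectors:
            # pop() ensures each existing track is extended at most once.
            track = self._active.pop(vec.start, None)
            if track is None:
                # No track ends where this vector starts: begin a new track.
                track = Track(self._next_id, [vec.start])
                self._next_id += 1
            track.points.append(vec.end)
            # If two vectors end on the same pixel, the later one wins.
            still_active[vec.end] = track
        # Tracks that received no vector in this frame are dropped.
        self._active = still_active
        return list(still_active.values())
```

Calling `update` once per frame yields tracks whose `points` list grows for as long as the feature keeps being reported. A practical tracker would likely match with a distance tolerance rather than exact equality and would forward the tracks to the pose estimation module.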