A Performance Analysis of You Only Look Once Models for Deployment on Constrained Computational Edge Devices in Drone Applications

Lucas Rey, Ana M. Bernardos, Andrzej D. Dobrzycki, David Carramiñana, Luca Bergesio, Juan A. Besada, José Ramón Casar

TL;DR

The paper tackles the challenge of running real-time YOLOv8 object detection on energy-constrained UAV edge devices. It systematically compares YOLOv8n and YOLOv8s across NVIDIA Jetson Orin Nano, Jetson Orin NX, and Raspberry Pi 5, examining FP16 and INT8 quantization within an end-to-end drone image-processing pipeline and contrasting edge versus cloud processing. Key findings show INT8 delivers the fastest inference but can reduce accuracy (e.g., mAP50-95 drops from ~0.861 to ~0.797 in some cases), while FP16 often provides a favorable balance between speed and accuracy; Orin NX provides the strongest edge performance, and the Raspberry Pi 5 struggles to meet real-time demands. The results offer practical guidelines for selecting device-quantization configurations to optimize speed, accuracy, and energy efficiency in constrained UAV deployments, and point to hybrid edge-cloud approaches and future work with newer YOLO variants and digital-twin simulations.
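To put the quoted accuracy change in perspective, the short sketch below computes the absolute and relative drop between the two approximate mAP50-95 values mentioned above. The helper name `metric_drop` is hypothetical and introduced only for illustration; the two input values are the rounded figures from the summary, not exact results.

```python
def metric_drop(baseline: float, quantized: float) -> tuple[float, float]:
    """Return (absolute drop, relative drop) of a metric after quantization."""
    absolute = baseline - quantized
    relative = absolute / baseline
    return absolute, relative


# Approximate mAP50-95 values quoted in the summary (rounded, "in some cases").
abs_drop, rel_drop = metric_drop(0.861, 0.797)
print(f"absolute drop: {abs_drop:.3f}, relative drop: {rel_drop:.1%}")
# prints: absolute drop: 0.064, relative drop: 7.4%
```

A drop of about 0.064 points, or roughly 7.4% relative to the baseline, is the kind of cost that has to be weighed against the inference speed gained from INT8.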

Abstract

Advancements in embedded systems and Artificial Intelligence (AI) have enhanced the capabilities of Unmanned Aircraft Vehicles (UAVs) in computer vision. However, the integration of AI techniques onboard drones is constrained by their processing capabilities. To that end, this study evaluates the deployment of object detection models (YOLOv8n and YOLOv8s) on both resource-constrained edge devices and cloud environments. The objective is to carry out a comparative performance analysis using a representative real-time UAV image processing pipeline. Specifically, the NVIDIA Jetson Orin Nano, Orin NX, and Raspberry Pi 5 (RPI5) devices have been tested to measure their detection accuracy, inference speed, and energy consumption, as well as the effects of post-training quantization (PTQ). The results show that YOLOv8n surpasses YOLOv8s in inference speed, achieving 52 FPS on the Jetson Orin NX and 65 FPS with INT8 quantization. Conversely, the RPI5 failed to satisfy the real-time processing needs despite its suitability for low-energy applications. An analysis of both the cloud-based and edge-based end-to-end processing times showed that increased communication latencies hindered real-time applications, revealing trade-offs between edge processing (low latency) and cloud processing (quick processing). Overall, these findings contribute recommendations and optimization strategies for the deployment of AI models on UAVs.
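The throughput figures in the abstract are easier to compare against a real-time requirement once converted to per-frame times. The sketch below does that arithmetic for the two reported YOLOv8n rates on the Jetson Orin NX. The 30 FPS camera rate used as the real-time budget is an assumption made for illustration and is not taken from the paper, and the helper name `frame_time_ms` is hypothetical.

```python
def frame_time_ms(fps: float) -> float:
    """Convert a throughput in frames per second to milliseconds per frame."""
    return 1000.0 / fps


# Throughputs reported in the abstract for YOLOv8n on the Jetson Orin NX.
REPORTED_FPS = {"reported": 52.0, "INT8": 65.0}

# Assumption: a 30 FPS camera stream defines the real-time budget.
CAMERA_FPS = 30.0
budget_ms = frame_time_ms(CAMERA_FPS)

for label, fps in REPORTED_FPS.items():
    t = frame_time_ms(fps)
    # Headroom is the time left per frame for capture, transfer, and post-processing.
    print(f"{label}: {t:.1f} ms per frame, headroom {budget_ms - t:.1f} ms")

speedup = REPORTED_FPS["INT8"] / REPORTED_FPS["reported"]
print(f"INT8 speedup: {speedup:.2f}x")
```

Under that assumed budget of about 33.3 ms per frame, both configurations leave headroom for the rest of the pipeline, and the INT8 rate corresponds to a 1.25x speedup. Note that these are inference rates only; communication latency, which the abstract identifies as the limiting factor for cloud processing, would be added on top.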

Paper Structure

This paper contains 19 sections, 8 figures, 9 tables.

Figures (8)

  • Figure S1: Diagram of the indoor controlled flight environment equipped with OptiTrack sensors, drones with access points, and a TurtleBot target on the ground.
  • Figure S2: Examples of different dataset images where the object to be detected has been manually annotated. Images are captured from different angles and heights and with different background environments.
  • Figure S3: Quantization and deployment process of YOLOv8 models on the Jetson Orin Nano, Jetson Orin NX, and Raspberry Pi 5.
  • Figure S4: Mean iteration times of each model (YOLOv8s or YOLOv8n) for each quantization version (FP32, FP16, or INT8) on each device (the Orin NX, Orin Nano, or Raspberry Pi 5).
  • Figure S5: FPS test results of each isolated model (YOLOv8s or YOLOv8n) for each quantization version (FP32, FP16, or INT8) on each device.
  • ...and 3 more figures