YolovN-CBi: A Lightweight and Efficient Architecture for Real-Time Detection of Small UAVs

Ami Pandat, Punna Rajasekhar, Gopika Vinod, Rohit Shukla

TL;DR

The paper tackles real-time detection of small UAVs in challenging aerial imagery by enhancing YOLO backbones with CBAM attention and BiFPN-inspired fusion, producing YolovN-CBi variants. It introduces a Flying Object (FO) training dataset (28k images) and uses several public test sets plus a local test set to assess small-object performance, showing that Yolov5-CBi variants outperform newer YOLO versions in the speed-accuracy trade-off for tiny drones. A knowledge-distillation study shows that adversarial distillation yields compact, edge-friendly Yolov5n-CBi students that match or exceed teacher performance while running substantially faster (up to ~83% faster). The findings indicate that carefully placed attention and multiscale fusion, coupled with distillation, yield lightweight, robust drone detectors suitable for real-time surveillance on resource-constrained platforms.

Abstract

Unmanned Aerial Vehicles, commonly known as drones, pose increasing risks in civilian and defense settings, demanding accurate, real-time drone detection systems. However, detecting drones is challenging because of their small size, rapid movement, and low visual contrast. A modified YolovN architecture, called YolovN-CBi, is proposed that incorporates the Convolutional Block Attention Module (CBAM) and the Bidirectional Feature Pyramid Network (BiFPN) to improve sensitivity to small objects. A curated training dataset of 28K images of various flying objects is created, and a local test dataset of 2500 images containing very small drones is collected. The proposed architecture is evaluated on four benchmark datasets, along with the local test dataset. The baseline Yolov5 and the proposed Yolov5-CBi architecture outperform newer Yolo versions, including Yolov8 and Yolov12, in the speed-accuracy trade-off for small object detection. Four other variants of the CBi architecture, which differ in the placement and usage of CBAM and BiFPN, are also proposed and evaluated. These variants are further distilled using knowledge distillation techniques for edge deployment, using a Yolov5m-CBi teacher and a Yolov5n-CBi student. The distilled model achieved a mAP@0.5:0.9 of 0.6573, a 6.51% improvement over the teacher's score of 0.6171, highlighting the effectiveness of the distillation process. The distilled model is 82.9% faster than the baseline model, making it more suitable for real-time drone detection. These findings highlight the effectiveness of the proposed CBi architecture, together with the distilled lightweight models, in advancing efficient and accurate real-time detection of small UAVs.
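
The 6.51% figure quoted in the abstract reads as a relative gain over the teacher's score rather than an absolute difference in mAP. The short Python sketch below checks that reading using only the two scores given in the abstract; the function name `relative_improvement_pct` is a hypothetical helper introduced here for illustration and is not taken from the paper.

```python
def relative_improvement_pct(reference: float, candidate: float) -> float:
    """Percentage change of `candidate` relative to `reference`.

    Hypothetical helper for illustration; not part of the paper's code.
    """
    if reference == 0:
        raise ValueError("reference score must be non-zero")
    return (candidate - reference) / reference * 100.0


# Scores as reported in the abstract.
teacher_map = 0.6171  # Yolov5m-CBi teacher
student_map = 0.6573  # distilled Yolov5n-CBi student

gain_pct = relative_improvement_pct(teacher_map, student_map)
abs_gain = student_map - teacher_map

print(f"relative gain: {gain_pct:.2f}%")
print(f"absolute gain: {abs_gain:.4f} mAP")
```

Under this reading, the absolute difference is about 0.04 mAP, and dividing it by the teacher's 0.6171 gives the reported relative figure.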

Paper Structure

This paper contains 22 sections, 5 equations, 11 figures, 6 tables.

Figures (11)

  • Figure 1: Convolutional Block Attention Module (CBAM) architecture
  • Figure 2: YOLOv5 architecture with BiFPN-inspired enhancement. Purple arrows represent added connections from backbone to neck that help preserve small-scale features.
  • Figure 3: Yolov5-CBi architecture: Yolov5 enhanced with CBAM in the backbone and BiFPN for improved attention and multiscale fusion. BiFPN integration is highlighted with purple arrows connecting backbone blocks to the neck at different scales. CBAM is placed before the SPPF block in the backbone.
  • Figure 4: Early Attention: CBAM introduced before concatenating features of C3 layers from backbone
  • Figure 5: C3b: Modified C3 Block with CBAM replacing bottleneck blocks
  • ...and 6 more figures