UCDNet: Multi-UAV Collaborative 3D Object Detection Network by Reliable Feature Mapping

Pengju Tian, Peirui Cheng, Yuchao Wang, Zhechao Wang, Zhirui Wang, Menglong Yan, Xue Yang, Xian Sun

TL;DR

UCDNet tackles the challenge of precise and consistent 3D object detection in multi-UAV scenarios by introducing Ground-Prior-Guided Feature Mapping (GFM) to exploit ground depth as a strong prior for feature mapping, and a Homologous Point Geometric Consistency Loss (HPL) to enforce cross-view geometric consistency. The framework fuses features across multiple UAVs in BEV space through a four-stage pipeline (image encoding, feature mapping, fusion, and detection) and is validated on AeroCollab3D, a CARLA-based UAV dataset, and CoPerception-UAVs, showing substantial mAP gains over baselines (e.g., +4.7% on AeroCollab3D and +10% on CoPerception-UAVs). A dedicated AeroCollab3D dataset with diverse maps and high-resolution imagery supports robust evaluation under occlusion and wide-view conditions. The two proposed modules improve mapping reliability and convergence without increasing inference cost, highlighting the viability of camera-based, BEV-centric multi-UAV perception in practical UAV applications.

Abstract

Multi-UAV collaborative 3D object detection can perceive and comprehend complex environments by integrating complementary information, with applications including traffic monitoring, delivery services, and agricultural management. However, the extremely broad field of observation in aerial remote sensing and the significant perspective differences across UAVs make it challenging to achieve precise and consistent feature mapping from 2D images to 3D space in the multi-UAV collaborative 3D object detection paradigm. To address this problem, we propose an unparalleled camera-based multi-UAV collaborative 3D object detection paradigm called UCDNet. Specifically, the depth from the UAVs to the ground is explicitly used as a strong prior that provides a reference for more accurate and generalizable feature mapping. Additionally, we design a homologous point geometric consistency loss as auxiliary self-supervision that acts directly on the feature mapping module, thereby strengthening the global consistency of multi-view perception. Experiments on the AeroCollab3D and CoPerception-UAVs datasets show that our method improves mAP by 4.7% and 10%, respectively, over the baseline, which demonstrates the superiority of UCDNet.
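
The two geometric ideas named in the abstract can be illustrated with a small sketch. It is not the paper's GFM module or HPL loss, whose formulations are not given here. It assumes a pinhole camera, a flat ground plane at a known world height, and a z-up world frame, and every function and parameter name below is hypothetical. The first function computes the camera-frame depth at which a pixel's viewing ray meets the ground, which is the kind of quantity a ground-depth prior could be built from. The second measures how far apart two views place the same (homologous) point after back-projection, which is the kind of residual a cross-view consistency term could penalize.

```python
import math

def mat_vec(R, v):
    """Multiply a 3x3 matrix (nested lists) by a 3-vector."""
    return [sum(R[i][j] * v[j] for j in range(3)) for i in range(3)]

def pixel_ray_world(K, R_c2w, u, v):
    """World-frame ray of pixel (u, v), scaled so its camera-frame z is 1.

    K is (fx, fy, cx, cy); R_c2w rotates camera axes (x right, y down,
    z forward) into the world frame (z up). Both are illustrative inputs.
    """
    fx, fy, cx, cy = K
    ray_cam = [(u - cx) / fx, (v - cy) / fy, 1.0]
    return mat_vec(R_c2w, ray_cam)

def ground_depth(K, R_c2w, cam_pos, u, v, ground_z=0.0):
    """Depth along the optical axis at which the pixel ray hits the plane
    z = ground_z, or None if the ray never reaches the ground."""
    ray = pixel_ray_world(K, R_c2w, u, v)
    if ray[2] >= 0.0:  # ray is level or points upward: no ground hit
        return None
    depth = (ground_z - cam_pos[2]) / ray[2]
    return depth if depth > 0.0 else None

def back_project(K, R_c2w, cam_pos, u, v, depth):
    """World point seen at pixel (u, v) for a given camera-frame depth."""
    ray = pixel_ray_world(K, R_c2w, u, v)
    return [cam_pos[i] + depth * ray[i] for i in range(3)]

def homologous_residual(point_a, point_b):
    """Euclidean distance between two back-projections of the same point."""
    return math.dist(point_a, point_b)

# Example: one UAV looking straight down, one pitched 45 degrees downward.
K = (1000.0, 1000.0, 960.0, 540.0)
s = math.sqrt(0.5)
R_NADIR = [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, -1.0]]
R_OBLIQUE = [[1.0, 0.0, 0.0], [0.0, -s, s], [0.0, -s, -s]]

d_a = ground_depth(K, R_NADIR, [0.0, 100.0, 50.0], 960, 540)
d_b = ground_depth(K, R_OBLIQUE, [0.0, 0.0, 100.0], 960, 540)
p_a = back_project(K, R_NADIR, [0.0, 100.0, 50.0], 960, 540, d_a)
p_b = back_project(K, R_OBLIQUE, [0.0, 0.0, 100.0], 960, 540, d_b)
residual = homologous_residual(p_a, p_b)  # near zero: both views agree
```

In this toy setup both central pixels look at the same ground point, so the residual vanishes when each view uses its ground depth. Perturbing either depth moves the back-projected points apart, which is the signal a consistency loss would use.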

Paper Structure (18 sections, 10 equations, 17 figures, 3 tables)

Figures (17)

  • Figure 1: The illustration of multi-UAV collaborative 3D object detection.
  • Figure 2: The overall framework of UCDNet. Ground-Prior-Guided Feature Mapping explicitly uses the ground as a strong prior to provide a reference for more accurate and generalizable feature mapping, and the Homologous Point Geometric Consistency Loss is proposed as an auxiliary self-supervision that directly influences the feature mapping network, strengthening the global consistency of multi-view perception.
  • Figure 3: UAV perspective diagram.
  • Figure 4: Schematic of the GFM module’s operational flow.
  • Figure 5: Histogram of pixel depth and depth from ground to pixel.
  • ...and 12 more figures