UAV-Based Human Body Detector Selection and Fusion for Geolocated Saliency Map Generation
Piotr Rudol, Patrick Doherty, Mariusz Wzorek, Chattrakul Sombattheera
TL;DR
Geolocating objects in outdoor search and rescue (SAR) missions with multi-UAV teams is challenging under bandwidth and compute constraints. The paper proposes an end-to-end solution comprising an offline evaluation of vision-based detectors across video codecs, a detector allocation strategy based on integer linear programming (ILP), and a probabilistic fusion framework that generates saliency maps yielding 3D salient locations. Key contributions include a detector evaluation protocol using AP/AR and LRP metrics under limited bitrate, a formal ILP model with variables $x_{vdb}$ and $y_{vdf}$ and objective $acc_{obj}$, and a log-odds grid fusion approach that accounts for detector reliability and geometry. By distributing detectors across a heterogeneous UAV network and fusing their detections into geolocated saliency maps, the approach supports timely and reliable SAR operations. It is validated in simulated and real flight experiments.
Abstract
The problem of reliably detecting and geolocating objects of different classes in soft real-time is essential in many application areas, such as Search and Rescue performed using Unmanned Aerial Vehicles (UAVs). This research addresses the complementary problems of system-contextual vision-based detector selection, allocation, and execution, in addition to the fusion of detection results from teams of UAVs for the purpose of accurately and reliably geolocating objects of interest in a timely manner. In an offline step, an application-independent evaluation of vision-based detectors from a system perspective is first performed. Based on this evaluation, the most appropriate algorithms for online object detection for each platform are selected automatically before a mission, taking into account a number of practical system considerations, such as the available communication links, the video compression used, and the available computational resources. The detection results are fused using a method for building maps of salient locations that takes advantage of a novel sensor model for vision-based detections, covering both positive and negative observations. A number of simulated and real flight experiments are also presented, validating the proposed method.
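To make the idea of fusing positive and negative observations concrete, the following is a minimal sketch of a generic log-odds grid update (a standard binary Bayes filter). It is not the paper's sensor model: the class name `SaliencyGrid` and the parameters `p_hit` (probability of a detection given that an object is present) and `p_false` (probability of a detection given that no object is present) are assumptions introduced here for illustration only, and the values used are arbitrary.

```python
import math


def logit(p):
    """Convert a probability to log-odds."""
    return math.log(p / (1.0 - p))


def to_probability(log_odds):
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))


class SaliencyGrid:
    """Hypothetical 2D grid storing per-cell log-odds of object presence."""

    def __init__(self, rows, cols, prior=0.5):
        # Every cell starts at the prior belief, stored as log-odds.
        self.cells = [[logit(prior)] * cols for _ in range(rows)]

    def update(self, row, col, detected, p_hit=0.8, p_false=0.1):
        # Bayes' rule in log-odds form: add the log likelihood ratio
        # of the observation (assumed, illustrative detector rates).
        if detected:
            # Positive observation: a detection was reported for this cell.
            self.cells[row][col] += math.log(p_hit / p_false)
        else:
            # Negative observation: the cell was in view, nothing detected.
            self.cells[row][col] += math.log((1.0 - p_hit) / (1.0 - p_false))

    def probability(self, row, col):
        return to_probability(self.cells[row][col])


grid = SaliencyGrid(rows=2, cols=2)
grid.update(0, 0, detected=True)    # belief in cell (0, 0) rises
grid.update(1, 1, detected=False)   # belief in cell (1, 1) falls
```

Storing log-odds turns each fusion step into an addition, so observations from several UAVs can be accumulated in any order. In this sketch a more reliable detector would simply be given a larger `p_hit` and a smaller `p_false`, which makes its observations shift the cell belief more strongly.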
