Exploring Machine Learning Engineering for Object Detection and Tracking by Unmanned Aerial Vehicle (UAV)

Aneesha Guna, Parth Ganeriwala, Siddhartha Bhattacharyya

TL;DR

This work addresses indoor search and rescue (SAR) by building an autonomous UAV perception pipeline for object detection and tracking. It introduces the ALDOT framework for automated labeling with assurance, and combines YOLOv4 and Mask R-CNN to detect and segment objects, deployed on a Parrot Mambo for real-time tracking. A Roomba-based indoor dataset (17,983 frames) was created, bootstrapped with 20 labeled images and refined through automated labeling validated with reasoning based on the law of large numbers (LLN) and the central limit theorem (CLT). Empirical results show high accuracy and real-time performance, demonstrating feasibility for autonomous indoor SAR tasks and suggesting future expansion to human-target detection. The approach highlights practical integration of data collection, labeling assurance, and deep learning (DL) based perception on lightweight UAVs, with potential impact on safe, autonomous search operations.

Abstract

With the advancement of deep learning methods, autonomous systems are expected to become increasingly intelligent, incorporating advanced machine learning algorithms to execute a variety of autonomous operations. One such task is the design and evaluation of a perception subsystem for object detection and tracking. The challenges in building software for this task lie in establishing the need for a dataset, annotating it, selecting features, and integrating and refining existing algorithms, while evaluating performance metrics through training and testing. This research effort focuses on the development of a machine learning pipeline that emphasizes assurance methods alongside increasing automation. In the process, a new dataset was created by collecting videos of a moving object, such as a Roomba vacuum cleaner, to emulate search and rescue (SAR) in an indoor environment. Individual frames were extracted from the videos and labeled using a combination of manual and automated techniques. The annotated dataset was refined for accuracy by first training a YOLOv4 model on it. The refined dataset was then used to train a second YOLOv4 model and a Mask R-CNN model for deployment on a Parrot Mambo drone to perform real-time object detection and tracking. Experimental results demonstrate the effectiveness of the models in accurately detecting and tracking the Roomba across multiple trials, achieving an average loss of 0.1942 and 96% accuracy.
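
The summary above says the automated labels were validated with LLN/CLT reasoning but does not give the procedure. As a rough illustration of what such a check can look like, the sketch below estimates labeling accuracy from a manually reviewed random sample of frames and attaches a normal-approximation (CLT) confidence interval. The function name `label_accuracy_interval`, the sample size, and the review outcomes are hypothetical and are not taken from the paper.

```python
import math


def label_accuracy_interval(checks, z=1.96):
    """Estimate labeling accuracy from spot-checked frames.

    checks: list of booleans, True if a reviewed automated label was correct.
    z: normal quantile (1.96 corresponds to roughly 95% confidence).
    Returns (estimate, lower bound, upper bound), clipped to [0, 1].
    """
    n = len(checks)
    if n == 0:
        raise ValueError("at least one reviewed frame is required")
    # LLN: the sample proportion approaches the true accuracy as n grows.
    p_hat = sum(checks) / n
    # CLT: the sample proportion is approximately normal with this standard error.
    std_err = math.sqrt(p_hat * (1.0 - p_hat) / n)
    margin = z * std_err
    return p_hat, max(0.0, p_hat - margin), min(1.0, p_hat + margin)


# Hypothetical review outcome: 190 of 200 sampled labels judged correct.
reviewed = [True] * 190 + [False] * 10
estimate, lower, upper = label_accuracy_interval(reviewed)
print(f"accuracy ~ {estimate:.3f}, interval [{lower:.3f}, {upper:.3f}]")
```

The interval narrows in proportion to the inverse square root of the number of reviewed frames, so a sample of a few hundred frames can bound the label error rate of a much larger dataset, provided the sample is drawn at random.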

Paper Structure

This paper contains 10 sections, 2 equations, 3 figures, 1 table.

Figures (3)

  • Figure 1: Flowchart of Proposed Methodology
  • Figure 2: Six instances of labeled images in dataset
  • Figure 3: Example of the drone’s screen during final testing