Multi-Species Object Detection in Drone Imagery for Population Monitoring of Endangered Animals

Sowmya Sankaran

TL;DR

The paper addresses counting endangered wildlife from drone imagery by fine-tuning the YOLOv8 object detector on multi-species aerial data. It combines data collection (baseline and African safari datasets), hyperparameter optimization across 30 model variants, and data augmentation, reaching a peak mAP-50 of 98.2% on elephants. Compared with the YOLOv8 baseline, the approach delivers up to 135x higher accuracy on safari datasets (95% versus 0.7%, per the abstract) and runs in real time (approximately 40 fps) on the Jetson Orin Nano, demonstrating viable low-power deployment in the field. The work offers a cost-effective edge-AI solution for conservation monitoring, with datasets and logs publicly available to support reproducibility and wider deployment in wildlife refuges and by park rangers.

Abstract

Animal populations worldwide are rapidly declining, and a technology that can accurately count endangered species could be vital for monitoring population changes over several years. This research focused on fine-tuning object detection models for drone images to create accurate counts of animal species. Hundreds of images taken using a drone and large, openly available drone-image datasets were used to fine-tune machine learning models with the baseline YOLOv8 architecture. We trained 30 different models, with the largest having 43.7 million parameters and 365 layers, and used hyperparameter tuning and data augmentation techniques to improve accuracy. While the state-of-the-art YOLOv8 baseline had only 0.7% accuracy on a dataset of safari animals, our models had 95% accuracy on the same dataset. Finally, we deployed the models on the Jetson Orin Nano for demonstration of low-power real-time species detection for easy inference on drones.
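
To illustrate how detector output might be turned into the per-species counts the abstract describes, here is a minimal sketch. It assumes each detection has already been reduced to a (species label, confidence) pair. The function name `count_species`, the 0.5 confidence threshold, and the sample detections are hypothetical and are not taken from the paper.

```python
from collections import Counter


def count_species(detections, min_confidence=0.5):
    """Tally detections per species, ignoring low-confidence boxes.

    `detections` is an iterable of (label, confidence) pairs.
    The 0.5 default threshold is an illustrative assumption.
    """
    return Counter(
        label for label, confidence in detections if confidence >= min_confidence
    )


# Hypothetical detections for one drone image: (species label, confidence).
detections = [
    ("elephant", 0.97),
    ("elephant", 0.91),
    ("zebra", 0.88),
    ("giraffe", 0.42),  # below the threshold, so it is not counted
]

counts = count_species(detections)
print(dict(counts))
```

Summing such counters over the images of a survey flight would give one population estimate per species, which could then be compared across years.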

Paper Structure (5 sections, 4 figures, 1 table)

Figures (4)

  • Figure 1: From left, images of: Sandhill cranes (taken by us); giraffes and zebras [elephantgiraffe]; dogs and humans (taken by us); and elephants [elephant].
  • Figure 2: Examples of the data augmentation techniques, from left: original image of Sandhill cranes; image with resize and vertical-axis flip; image with resize and bounding box flip.
  • Figure 3: Left, the model accuracy for 20 different models, with 4 model sizes trained on the baseline datasets. The 20 models are grouped in a hierarchical fashion, showing training and testing data. Right, a precision-recall curve for a model with 89.3% accuracy.
  • Figure 4: Left, a logarithmic-scale graph comparing YOLOv8's accuracy with our model's accuracy across datasets. Right, predictions on two images, with YOLOv8's on the left and our model's on the right. YOLOv8 could not identify the elephants or the bird species.
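
The flip augmentation in Figure 2 requires the bounding boxes to be mirrored together with the image. The sketch below shows one way this could work for a flip about the vertical axis. It assumes labels in the normalized YOLO-style format (class, x_center, y_center, width, height), which the source does not state explicitly. The helper names and the tiny nested-list "image" are hypothetical.

```python
def flip_image_horizontally(image):
    """Mirror an image (a list of pixel rows) about its vertical axis."""
    return [row[::-1] for row in image]


def flip_box_horizontally(box):
    """Mirror a normalized (class, x_center, y_center, width, height) box.

    Only x_center changes; the box size and vertical position are unaffected.
    """
    class_id, x_center, y_center, width, height = box
    return (class_id, 1.0 - x_center, y_center, width, height)


# Toy 2x4 image with a bright object in the rightmost column.
image = [
    [0, 0, 0, 9],
    [0, 0, 0, 9],
]
# Assumed label: class 0, centered on the rightmost column.
box = (0, 0.875, 0.5, 0.25, 1.0)

flipped_image = flip_image_horizontally(image)
flipped_box = flip_box_horizontally(box)
print(flipped_image)
print(flipped_box)
```

After the flip, the object sits in the leftmost column and the box center moves accordingly, so image and label remain aligned. Flipping the image without updating the box would silently corrupt the training labels.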