A Comprehensive Evaluation of YOLO-based Deer Detection Performance on Edge Devices
Bishal Adhikari, Jiajia Li, Eric S. Michel, Jacob Dykes, Te-Ming Paul Tseng, Mary Love Tagert, Dong Chen
TL;DR
This work tackles the practical challenge of real-time deer detection at the edge by introducing a public deer camera-trap dataset and a comprehensive benchmark of YOLOv8 to YOLOv11 variants on the CPU-based Raspberry Pi 5 and the GPU-accelerated NVIDIA Jetson AGX Xavier. It demonstrates that high-accuracy models perform well on high-end GPUs, but CPU-only edge devices struggle to reach real-time performance, whereas the Jetson sustains real-time throughput with AP@0.5 above 0.85 for several small to medium models. The study highlights the domain gap between clean, laboratory-like data and challenging real-world camera-trap conditions, emphasizes the need for hardware-aware optimization, and identifies specific lightweight models (e.g., YOLOv11n, YOLOv8s, YOLOv9s) that balance accuracy and efficiency. The findings offer practical guidance for deploying autonomous deer-detection and deterrence systems in edge environments; future work will focus on optimization techniques and dataset expansion to improve robustness across conditions and species.
Abstract
Deer intrusion causes escalating economic losses in agriculture, estimated at hundreds of millions of dollars annually in the U.S., which highlights the inadequacy of traditional mitigation strategies such as hunting, fencing, repellents, and scare tactics. This underscores a critical need for intelligent, autonomous solutions capable of real-time deer detection and deterrence. Progress in this field, however, is impeded by two gaps in the literature: the lack of a practical, domain-specific dataset and the limited study of the viability of deer detection systems on edge devices. To address these gaps, this study presents a comprehensive evaluation of state-of-the-art deep learning models for deer detection in challenging real-world scenarios. First, we introduce a curated, publicly available dataset of 3,095 images annotated with deer bounding boxes. Second, we provide an extensive comparative analysis of 12 model variants across four recent YOLO architectures (v8 to v11). Finally, we evaluate their performance on two representative edge computing platforms, the CPU-based Raspberry Pi 5 and the GPU-accelerated NVIDIA Jetson AGX Xavier, to assess feasibility for real-world field deployment. Results show that real-time detection is not feasible on the Raspberry Pi without hardware-specific model optimization, while the NVIDIA Jetson delivers more than 30 frames per second (FPS) with the 's' and 'n' series models. The study also reveals that smaller, architecturally advanced models such as YOLOv11n, YOLOv8s, and YOLOv9s offer the best balance of high accuracy (Average Precision (AP) > 0.85) and computational efficiency (inference time < 34 milliseconds).
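The abstract reports efficiency both as throughput (FPS) and as per-image inference time, and the two are related by FPS = 1000 / latency in milliseconds. The following is a minimal sketch of that conversion, not the authors' benchmarking code. The helper names (`latency_ms_to_fps`, `measure_latency_ms`, `is_real_time`), the warm-up count, and the use of 30 FPS as the real-time threshold are assumptions made for illustration; `infer` stands for any callable that runs a model on one frame.

```python
import time
from statistics import mean


def latency_ms_to_fps(latency_ms: float) -> float:
    """Convert a per-frame latency in milliseconds to frames per second."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    return 1000.0 / latency_ms


def measure_latency_ms(infer, frames, warmup: int = 3) -> float:
    """Return the mean wall-clock latency (ms) of `infer` over `frames`.

    `infer` is a hypothetical stand-in for a single-frame model call.
    The first `warmup` frames are run once untimed so that one-off
    start-up costs do not skew the average.
    """
    frames = list(frames)
    for frame in frames[:warmup]:
        infer(frame)
    timings = []
    for frame in frames:
        start = time.perf_counter()
        infer(frame)
        timings.append((time.perf_counter() - start) * 1000.0)
    return mean(timings)


def is_real_time(latency_ms: float, target_fps: float = 30.0) -> bool:
    """Check a latency against an assumed real-time target (default 30 FPS)."""
    return latency_ms_to_fps(latency_ms) >= target_fps


if __name__ == "__main__":
    for ms in (20.0, 33.0, 34.0):
        print(f"{ms:.0f} ms -> {latency_ms_to_fps(ms):.1f} FPS, "
              f"real-time: {is_real_time(ms)}")
```

Under this assumed threshold, a model needs a latency of at most about 33.3 ms per frame to reach 30 FPS, so an inference time of exactly 34 ms would correspond to roughly 29.4 FPS.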
