Enhancing Pollinator Conservation towards Agriculture 4.0: Monitoring of Bees through Object Recognition
Ajay John Alex, Chloe M. Barnes, Pedro Machado, Isibor Ihianle, Gábor Markó, Martin Bencsik, Jordan J. Bird
TL;DR
This work addresses the challenge of monitoring pollinator populations under climate change by leveraging vision-based object detection. It introduces a large open dataset, Bee Detection in the Wild (9,664 images with 13,402 bounding boxes), and benchmarks YOLO-based detectors (YOLOv5s, YOLOv5m, and variants) with data augmentation, reporting that YOLOv5m achieves the best accuracy while YOLOv5s delivers superior real-time performance. An explainable AI interface translates detections into timestamped reports for non-technical stakeholders, bridging research with practical beekeeping and conservation needs. The study demonstrates the feasibility of real-time, accessible pollinator monitoring aligned with Agriculture 4.0 and outlines future directions, including richer augmentation, multi-species and behaviour detection, and edge/cloud deployment.
Abstract
In an era of rapid climate change and its adverse effects on food production, technology for monitoring pollinators is of paramount importance to conservation and global food security. The survival of the human species depends on the conservation of pollinators. This article explores the use of Computer Vision and Object Recognition to autonomously track and report bee behaviour from images. A novel dataset of 9664 images containing bees is extracted from video streams and annotated with bounding boxes. With training, validation and testing sets (6722, 1915, and 997 images, respectively), the results of fine-tuning COCO-pretrained YOLO models show that YOLOv5m is the most effective approach in terms of recognition accuracy. However, YOLOv5s was shown to be the best suited to real-time bee detection, with an average processing and inference time of 5.1 ms per video frame, at the cost of slightly lower recognition accuracy. The trained model is then packaged within an explainable AI interface, which converts detection events into timestamped reports and charts, with the aim of facilitating use by non-technical users such as expert stakeholders from the apiculture industry, towards informing responsible consumption and production.
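The abstract does not describe how detection events become timestamped reports, so the following is only a minimal sketch of one way such a step could work, not the authors' interface. It assumes the detector has already produced a count of bounding boxes for each video frame and that the frame rate is known; the names `DetectionEvent`, `summarise_detections`, and `format_report` are hypothetical.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class DetectionEvent:
    start: timedelta  # time of the first frame in the run
    end: timedelta    # time of the last frame in the run
    peak_count: int   # most boxes seen in any single frame of the run


def summarise_detections(frame_counts, fps):
    """Group consecutive frames with at least one detection into events.

    frame_counts: number of detected boxes per frame (assumed input).
    fps: frames per second of the source video (assumed known).
    """
    events = []
    run_start = None
    peak = 0

    def close_run(last_index):
        events.append(
            DetectionEvent(
                start=timedelta(seconds=run_start / fps),
                end=timedelta(seconds=last_index / fps),
                peak_count=peak,
            )
        )

    for index, count in enumerate(frame_counts):
        if count > 0:
            if run_start is None:
                run_start = index
                peak = 0
            peak = max(peak, count)
        elif run_start is not None:
            close_run(index - 1)
            run_start = None

    # Close a run that is still open when the video ends.
    if run_start is not None:
        close_run(len(frame_counts) - 1)
    return events


def format_report(events):
    """Render one plain-text line per event for a non-technical reader."""
    return [
        f"{event.start} to {event.end}: up to {event.peak_count} bee(s) in view"
        for event in events
    ]


# Hypothetical per-frame counts from a 1 frame-per-second clip.
example_events = summarise_detections([0, 2, 3, 0, 0, 1], fps=1)
example_report = format_report(example_events)
```

Grouping consecutive frames keeps the report short: a bee that stays in view for several seconds yields one entry instead of one per frame. A real deployment would probably also tolerate brief gaps caused by missed detections, which this sketch does not attempt.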
