Real-Time On-the-Go Annotation Framework Using YOLO for Automated Dataset Generation
Mohamed Abdallah Salem, Ahmed Harb Rabia
TL;DR
Manual annotation significantly limits rapid deployment of YOLO-based detectors in precision agriculture. The authors introduce an edge-enabled, real-time on-the-go annotation framework that performs live detection and YOLO-format labeling during image capture, enabling immediate dataset expansion. Through 12 training configurations across YOLOv5, YOLOv8, and YOLOv12 on a weed-crop dataset, they show that pretrained and single-class approaches yield faster convergence, higher accuracy, and robust field performance, with YOLOv12 often delivering the best results while YOLOv8 offers faster inference. Real-world field tests confirm the framework's practicality for in-field labeling and rapid data collection, supporting scalable, timely model refinement in dynamic agricultural environments.
Abstract
Efficient and accurate annotation of datasets remains a significant challenge for deploying object detection models such as You Only Look Once (YOLO) in real-world applications, particularly in agriculture, where rapid decision-making is critical. Traditional annotation techniques are labor-intensive, requiring extensive manual labeling after data collection. This paper presents a novel real-time annotation approach leveraging YOLO models deployed on edge devices, enabling immediate labeling during image capture. To comprehensively evaluate the efficiency and accuracy of our proposed system, we conducted an extensive comparative analysis using three prominent YOLO architectures (YOLOv5, YOLOv8, YOLOv12) under various configurations: single-class versus multi-class annotation and pretrained versus from-scratch training. Our analysis includes detailed statistical tests and learning dynamics, demonstrating significant advantages of pretrained and single-class configurations in terms of model convergence, performance, and robustness. Results strongly validate the feasibility and effectiveness of our real-time annotation framework, highlighting its capability to drastically reduce dataset preparation time while maintaining high annotation quality.
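To make the labeling step concrete, the following is a minimal sketch of how one detection could be turned into a YOLO-format label line (class index, then box center and size normalized by the image dimensions). It is an illustration under stated assumptions, not the authors' implementation: the `Detection` tuple layout and the function names `to_yolo_line` and `to_yolo_label` are hypothetical, and the detector that would produce the pixel boxes on the edge device is left out.

```python
from typing import Iterable, Tuple

# Hypothetical detection layout (an assumption for this sketch):
# (class_id, x_min, y_min, x_max, y_max), with box corners in pixels.
Detection = Tuple[int, float, float, float, float]


def to_yolo_line(det: Detection, img_w: int, img_h: int) -> str:
    """Convert one pixel-space box to a YOLO label line.

    YOLO labels store: class x_center y_center width height,
    with all four box values normalized to the range [0, 1].
    """
    class_id, x1, y1, x2, y2 = det

    # Clamp the corners to the image so normalized values stay in [0, 1].
    x1, x2 = max(0.0, min(x1, img_w)), max(0.0, min(x2, img_w))
    y1, y2 = max(0.0, min(y1, img_h)), max(0.0, min(y2, img_h))

    x_center = (x1 + x2) / 2.0 / img_w
    y_center = (y1 + y2) / 2.0 / img_h
    width = (x2 - x1) / img_w
    height = (y2 - y1) / img_h

    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


def to_yolo_label(dets: Iterable[Detection], img_w: int, img_h: int) -> str:
    """Build the text of a label file: one line per detected object."""
    return "\n".join(to_yolo_line(d, img_w, img_h) for d in dets)


if __name__ == "__main__":
    # One hypothetical detection of class 0 in a 640x480 frame.
    print(to_yolo_label([(0, 100, 50, 300, 250)], 640, 480))
```

In an on-the-go setting, the returned text would typically be written to a `.txt` file sharing the captured image's base name, so each frame and its labels can be added to the training set together.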
