BloomNet: Exploring Single vs. Multiple Object Annotation for Flower Recognition Using YOLO Variants

Safwat Nusrat, Prithwiraj Bhattacharjee

TL;DR

This work tackles robust flower detection under realistic density conditions by comparing multiple YOLO variants on a multispecies dataset annotated for both sparse and dense scenes. It introduces FloralSix (2,816 high-resolution images across six species) and evaluates two annotation regimes, single-image single-bounding box (SISBB) and single-image multiple-bounding box (SIMBB), across YOLOv5s, YOLOv8n/s/m, and YOLOv12n using the SGD and AdamW optimizers, reporting $mAP@0.5$ and $mAP@0.5:0.95$ alongside Precision and Recall. Key findings show that annotation density and optimizer choice shape performance: SISBB favors precision-heavy detectors (e.g., YOLOv8m with SGD), while SIMBB benefits from density-generalist models (e.g., YOLOv12n with SGD), and SGD consistently outperforms AdamW. The study provides a density-aware baseline and practical guidance for deploying real-time flower detection in precision agriculture, including UAV-based monitoring and robotic pollination tasks.

Abstract

Precise localization and recognition of flowers are crucial for advancing automated agriculture, particularly in plant phenotyping, crop estimation, and yield monitoring. This paper benchmarks several YOLO architectures (YOLOv5s, YOLOv8n/s/m, and YOLOv12n) for flower object detection under two annotation regimes: single-image single-bounding box (SISBB) and single-image multiple-bounding box (SIMBB). It also introduces the FloralSix dataset, comprising 2,816 high-resolution photos of six flower species, annotated for both dense (clustered) and sparse (isolated) scenarios. The models were evaluated using Precision, Recall, and Mean Average Precision (mAP) at an IoU threshold of 0.5 (mAP@0.5) and averaged over IoU thresholds from 0.5 to 0.95 (mAP@0.5:0.95). In SISBB, YOLOv8m (SGD) achieved the best results with Precision 0.956, Recall 0.951, mAP@0.5 0.978, and mAP@0.5:0.95 0.865, illustrating strong accuracy in detecting isolated flowers. In the more challenging SIMBB scenario, YOLOv12n (SGD) performed best with mAP@0.5 0.934 and mAP@0.5:0.95 0.752, demonstrating robustness in dense, multi-object detection. The results show how annotation density, IoU thresholds, and model size interact: recall-optimized models perform better in crowded environments, whereas precision-oriented models perform best in sparse scenarios. In both cases, the Stochastic Gradient Descent (SGD) optimizer consistently performed better than the alternatives. These density-sensitive detectors are helpful for non-destructive crop analysis, growth tracking, robotic pollination, and stress evaluation.
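
To make the two reported metrics concrete, the following is a minimal sketch of how a single predicted box is judged against a ground-truth box at different IoU thresholds. It is not the authors' evaluation code: the function names, the `(x1, y1, x2, y2)` box format, and the example coordinates are assumptions chosen for illustration, and the threshold grid follows the common COCO-style convention of 0.50 to 0.95 in steps of 0.05.

```python
def iou(box_a, box_b):
    """Intersection over Union for two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (if any).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Assumed COCO-style grid: 0.50, 0.55, ..., 0.95 (ten thresholds).
IOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(gt_box, pred_box):
    """Thresholds at which the prediction would count as a match."""
    score = iou(gt_box, pred_box)
    return [t for t in IOU_THRESHOLDS if score >= t]


# Hypothetical boxes: the prediction is shifted 10 px from the ground truth.
gt = (0, 0, 100, 100)
pred = (10, 10, 110, 110)
print(f"IoU = {iou(gt, pred):.3f}")
print("Counts as a match at:", thresholds_passed(gt, pred))
```

In this example the slightly shifted prediction is a match at the 0.5 threshold but fails the stricter ones, which is why mAP@0.5:0.95 is always at most mAP@0.5 and is more sensitive to localization quality.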

Paper Structure (15 sections, 2 figures, 3 tables)

Figures (2)

  • Figure 1: Model architecture for flower detection using YOLO variants.
  • Figure 2: Comparison of object detection results under different annotation strategies: (a) SISBB (Single Image Single Bounding Box) and (b) SIMBB (Single Image Multiple Bounding Boxes). Ground-truth boxes are shown in orange, while predicted boxes are shown in blue.