AgRegNet: A Deep Regression Network for Flower and Fruit Density Estimation, Localization, and Counting in Orchards

Uddhav Bhattarai, Santosh Bhusal, Qin Zhang, Manoj Karkee

TL;DR

AgRegNet is a segmentation-assisted regression framework that estimates flower and fruit density, counts, and centroid locations in unstructured orchards using point annotations rather than precise object boundaries. Built as a U-Net–like network with a modified ConvNeXt-T encoder and a segmentation gate, it produces density maps that support counting and peak-based localization via Hungarian matching, while maintaining a lightweight footprint. The approach achieves high density-map fidelity (SSIM ≈ 0.94 for flowers, ≈ 0.91 for fruits), counting errors of 13.7% (flowers) and 5.6% (fruits) pMAE, and localization mAP of 0.81 (flowers) and 0.93 (fruits), with notable improvements from the segmentation branch and CBAM attention. Practically, AgRegNet offers rapid inference (≈14 ms), reduced annotation effort, and direct applicability to thinning, yield estimation, and labor planning in orchards, with multi-view robustness and broader crop coverage left as future directions.
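As a rough illustration of the "peak-based localization via Hungarian matching" step mentioned above, the sketch below picks local maxima from a density map and pairs them one-to-one with ground-truth centroids. It is not the authors' implementation: the 8-neighbour peak rule, the `threshold` and `max_dist` parameters, and the function names `find_peaks` and `match_points` are all assumptions made for this example. To stay dependency-free, it finds the minimum-cost assignment by brute force, which gives the same optimum as the Hungarian algorithm on tiny inputs; for real images one would likely use a proper solver such as `scipy.optimize.linear_sum_assignment`.

```python
import itertools
import math


def find_peaks(dmap, threshold=0.0):
    """Return (row, col) cells that exceed `threshold` and are strictly
    larger than all of their 8 neighbours (assumed peak rule)."""
    h, w = len(dmap), len(dmap[0])
    peaks = []
    for r in range(h):
        for c in range(w):
            v = dmap[r][c]
            if v <= threshold:
                continue
            neighbours = [
                dmap[rr][cc]
                for rr in range(max(0, r - 1), min(h, r + 2))
                for cc in range(max(0, c - 1), min(w, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(v > n for n in neighbours):
                peaks.append((r, c))
    return peaks


def match_points(pred, truth, max_dist):
    """One-to-one assignment minimising total Euclidean distance.

    Brute-force stand-in for the Hungarian algorithm (small inputs only).
    Returns (pred_index, truth_index) pairs no farther apart than `max_dist`.
    """
    swap = len(pred) > len(truth)
    small, large = (truth, pred) if swap else (pred, truth)
    best, best_cost = (), math.inf
    for perm in itertools.permutations(range(len(large)), len(small)):
        cost = sum(math.dist(small[i], large[j]) for i, j in enumerate(perm))
        if cost < best_cost:
            best, best_cost = perm, cost
    pairs = [(j, i) if swap else (i, j) for i, j in enumerate(best)]
    return [(p, t) for p, t in pairs if math.dist(pred[p], truth[t]) <= max_dist]


# Toy density map with two blobs (values are made up for illustration).
toy = [[0.0] * 7 for _ in range(7)]
toy[1][1], toy[1][2] = 1.0, 0.4
toy[5][4], toy[4][4] = 0.8, 0.3

peaks = find_peaks(toy)                       # [(1, 1), (5, 4)]
matches = match_points(peaks, [(5, 5), (1, 2)], max_dist=2.0)
print(peaks, matches)                         # matches: [(0, 1), (1, 0)]
```

Matched pairs can be counted as true positives, unmatched predictions as false positives, and unmatched ground-truth points as false negatives, which is the usual starting point for precision-recall style localization metrics.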

Abstract

One of the major challenges for the agricultural industry today is the uncertainty in manual labor availability and the associated cost. Automated flower and fruit density estimation, localization, and counting could help streamline harvesting, yield estimation, and crop-load management strategies such as flower and fruitlet thinning. This article proposes a deep regression-based network, AgRegNet, to estimate the density, count, and location of flowers and fruits in tree fruit canopies without explicit object detection or polygon annotation. Inspired by the popular U-Net architecture, AgRegNet is a U-shaped network with encoder-to-decoder skip connections and a modified ConvNeXt-T as the encoder feature extractor. AgRegNet can be trained using point annotations alone and leverages segmentation information and attention modules (spatial and channel) to highlight relevant flower and fruit features while suppressing irrelevant background features. Experimental evaluation on apple flower and fruit canopy images acquired in an unstructured orchard environment showed that AgRegNet achieved promising accuracy as measured by Structural Similarity Index (SSIM), percentage Mean Absolute Error (pMAE), and mean Average Precision (mAP) for estimating flower and fruit density, count, and centroid location, respectively. Specifically, the SSIM, pMAE, and mAP values for flower images were 0.938, 13.7%, and 0.81, respectively. For fruit images, the corresponding values were 0.910, 5.6%, and 0.93. Since the proposed approach relies only on point annotations, it is suitable for both sparsely and densely located objects. This simplified technique will be highly applicable for growers to accurately estimate yields and decide on optimal chemical and mechanical flower thinning practices.
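To make the point-annotation idea concrete: a common way to turn point labels into a regression target is to place a small unit-mass Gaussian at each annotated point, so that the map sums to the object count. The sketch below assumes that construction. The abstract does not specify the kernel, so `sigma`, the per-point normalisation, and the helper names are illustrative assumptions. The `pmae` formula (total absolute count error divided by total true count, times 100) is likewise one plausible reading of "percentage Mean Absolute Error", not a definition confirmed by the text.

```python
import math


def density_map(points, height, width, sigma=2.0):
    """Build a density map from (row, col) point annotations.

    Each point contributes a Gaussian that is normalised to sum to 1 inside
    the image, so the whole map sums to the number of points.
    """
    dmap = [[0.0] * width for _ in range(height)]
    for pr, pc in points:
        kernel = [
            [math.exp(-((r - pr) ** 2 + (c - pc) ** 2) / (2 * sigma ** 2))
             for c in range(width)]
            for r in range(height)
        ]
        mass = sum(map(sum, kernel))
        for r in range(height):
            for c in range(width):
                dmap[r][c] += kernel[r][c] / mass
    return dmap


def count_from_density(dmap):
    """Estimated object count = sum over all pixels of the density map."""
    return sum(map(sum, dmap))


def pmae(true_counts, pred_counts):
    """Assumed definition: 100 * sum(|true - pred|) / sum(true)."""
    abs_err = sum(abs(t - p) for t, p in zip(true_counts, pred_counts))
    return 100.0 * abs_err / sum(true_counts)


# Three hypothetical flower annotations in a 16 x 32 image.
target = density_map([(5, 5), (10, 20), (3, 3)], height=16, width=32)
print(round(count_from_density(target), 6))   # 3.0
print(pmae([10, 20], [9, 22]))                # 10.0
```

Because the count is recovered by summing the map rather than by detecting each object, overlapping Gaussians in dense clusters still add up to the correct total, which is one reason density regression suits both sparse and dense scenes.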

Paper Structure (24 sections, 7 equations, 8 figures, 5 tables, 1 algorithm)

Figures (8)

  • Figure 1: Example images from the apple flower and fruit datasets used in this work. The apple flower dataset consists of images acquired at two blooming stages (majority of flowers unopened, and full bloom) in two apple canopy training architectures (vertical wall and V-trellis). Publicly available fruit images from [gao2020multi] were annotated and evaluated.
  • Figure 2: AgRegNet, the proposed approach for flower and fruit density estimation in agricultural datasets. The network has a U-Net-inspired encoder-decoder framework with a modified ConvNeXt-T in the encoder. A segmentation branch is added to transmit relevant features and suppress background information.
  • Figure 3: a) Convolutional Block Attention Module (CBAM) [woo2018cbam], where $\otimes$ represents elementwise multiplication; b) residual connection block in the proposed modified ConvNeXt-T architecture.
  • Figure 4: Feature map processing in each decoder block.
  • Figure 5: Qualitative comparison of the density maps of the proposed AgRegNet with other approaches.
  • ...and 3 more figures