Integrating YOLO11 and Convolution Block Attention Module for Multi-Season Segmentation of Tree Trunks and Branches in Commercial Apple Orchards

Ranjan Sapkota, Manoj Karkee

TL;DR

This work addresses labor-intensive tasks in apple orchards by developing a year-round perception system for trunk and branch segmentation. It integrates the Convolutional Block Attention Module (CBAM) with YOLO11 and trains on a mixed dataset of dormant- and canopy-season images to achieve robust instance segmentation across seasons. Results show configuration- and class-dependent gains, with trunk segmentation reaching a precision of about 0.83 during training and high mAP@50 in dormant-season validation, while canopy-season validation demonstrates reliable performance under dense foliage. The approach has practical implications for autonomous pruning, thinning, and harvesting robots, though the authors note that larger seasonal datasets and cross-season image registration are needed to further improve year-round reliability.

Abstract

In this study, we developed a customized instance segmentation model by integrating the Convolutional Block Attention Module (CBAM) with the YOLO11 architecture. The model was trained on a mixed dataset of dormant and canopy season apple orchard images to improve the segmentation of tree trunks and branches under seasonal conditions that vary throughout the year. After training on the mixed two-season dataset, YOLO11-CBAM was validated separately on dormant season and canopy season images, and was additionally tested on images from the pre-bloom, flower bloom, fruit thinning, and harvest seasons. The highest recall and precision were observed with YOLO11x-seg-CBAM and YOLO11m-seg-CBAM, respectively. In training, YOLO11m-seg with CBAM achieved the highest precision for the Trunk class, 0.83, compared with 0.80 for YOLO11m-seg without CBAM. Likewise, for the Branch class, YOLO11m-seg with CBAM achieved the highest precision, 0.75, compared with 0.73 without CBAM. In dormant season validation, YOLO11x-seg exhibited the highest precision, 0.91. In canopy season validation, YOLO11s-seg showed the highest precision across all classes, achieving 0.516 for Branch and 0.64 for Trunk. The modeling approach, trained on images from two seasons (dormant and canopy), demonstrated the potential of the YOLO11-CBAM integration to detect and segment tree trunks and branches year-round.

Keywords: YOLOv11, YOLOv11 Tree Detection, YOLOv11 Branch Detection and Segmentation, Machine Vision, Deep Learning, Machine Learning
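To put the reported CBAM effect in perspective, the short sketch below computes the absolute and relative precision gains for YOLO11m-seg from the training values quoted in the abstract. Only the four precision numbers come from the source; the dictionary layout and the helper names `cbam_gain` and `summarize` are illustrative assumptions and are not part of the authors' code.

```python
# Training precision values quoted in the abstract for YOLO11m-seg.
# The table layout and function names are hypothetical, for illustration only.
REPORTED_PRECISION = {
    "Trunk": {"baseline": 0.80, "cbam": 0.83},
    "Branch": {"baseline": 0.73, "cbam": 0.75},
}


def cbam_gain(baseline, cbam):
    """Return (absolute gain, relative gain in percent) of CBAM over the baseline."""
    absolute = cbam - baseline
    relative_pct = 100.0 * absolute / baseline
    # Round to suppress floating-point noise in the two-decimal inputs.
    return round(absolute, 3), round(relative_pct, 2)


def summarize(table):
    """Map each class name to its (absolute, relative %) precision gain."""
    return {cls: cbam_gain(v["baseline"], v["cbam"]) for cls, v in table.items()}


if __name__ == "__main__":
    for cls, (abs_gain, rel_gain) in summarize(REPORTED_PRECISION).items():
        print(f"{cls}: +{abs_gain:.2f} absolute, +{rel_gain:.2f}% relative")
```

With these inputs, the gains come to roughly 0.03 (3.75% relative) for Trunk and 0.02 (about 2.74% relative) for Branch, which is consistent with the modest, class-dependent improvement the abstract describes.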

Paper Structure

This paper contains 26 sections, 4 equations, 16 figures, 3 tables.

Figures (16)

  • Figure 2: a) Displays the dormant season (December/January), showcasing essential winter pruning for optimal tree health and structure. b) Highlights the blossom season (March/April), with manual training of shoots on trellis wires for commercial tree architecture. c) Depicts the thinning of excess green fruits in June/July, critical for quality fruit development. d) Shows summer pruning in August to enhance canopy airflow and light distribution. e) Illustrates the labor-intensive apple harvest in October.
  • Figure 3: Image acquisition and methodology workflow: a) Dormant season image collection using a Microsoft Azure machine vision camera; b) Canopy season image collection using a Microsoft Azure machine vision camera; c) Workflow diagram.
  • Figure 4: Image labelling: a) Dormant season image labelled into trunk and branch classes; b) Canopy season image labelled into trunk and branch classes; c) Architecture diagram of the YOLO11 [sapkota2024comparing, sapkota2024comprehensive, sapkota2024yolo11] fusion implemented in this study.
  • Figure 5: Distinct seasonal conditions tested in a commercial apple orchard: a) Pre-blossom season image collection setup showing early branch development; b) Flower bloom season with active pollination and flower thinning activities; c) Green fruit thinning season highlighting the dense fruitlet clusters needing thinning; d) Harvest season image collection with ripe apples, ready for market.
  • Figure 6: Result Examples: (a) Depicts a dormant season scene where the YOLO11 model effectively segments a branch (red arrow) but misses another (yellow arrow). (b) Demonstrates the model's precision in avoiding false trunk identification (red arrow) and highlights missed branches (yellow arrows) in a dormant season context. (c) Shows areas where the model failed to detect prominent branches in the foreground (yellow arrows). (d) Displays a canopy season image, identifying most trunks and branches accurately, though some are missed (yellow arrows). (e) Illustrates successful branch segmentation in dense canopy foliage (red arrows). (f) Highlights accurate trunk segmentation amidst complex canopy coverage (red arrows).
  • ...and 11 more figures