A New Dataset and Comparative Study for Aphid Cluster Detection and Segmentation in Sorghum Fields

Raiyan Rahman; Christopher Indris; Goetz Bramesfeld; Tianxiao Zhang; Kaidong Li; Xiangyu Chen; Ivan Grijalva; Brian McCornack; Daniel Flippo; Ajay Sharda; Guanghui Wang

TL;DR

This work tackles aphid infestation in sorghum by introducing a large, real-field, multi-scale dataset of aphid clusters and benchmarking real-time segmentation and detection models. The study demonstrates that RT-DETR excels in detection while Fast-SCNN offers the best real-time segmentation, with Small HRNet achieving the highest segmentation accuracy at lower speeds. It argues that semantic segmentation provides more informative infestation assessments through mask-based area measurements and recommends deployment strategies for precision spraying. The dataset and findings offer a practical foundation for advancing sustainable, targeted pest control in agricultural settings.

Abstract

Aphid infestations are one of the primary causes of extensive damage to wheat and sorghum fields, and aphids are among the most common vectors for plant viruses, resulting in significant agricultural yield losses. To address this problem, farmers often rely on harmful chemical pesticides, applied inefficiently, that have negative health and environmental impacts. As a result, a large amount of pesticide is wasted on areas without significant pest infestation. This highlights the urgent need for an intelligent autonomous system that can locate sufficiently large infestations and spray them selectively within complex crop canopies. We have developed a large multi-scale dataset for aphid cluster detection and segmentation, collected from actual sorghum fields and meticulously annotated to include clusters of aphids. Our dataset comprises a total of 54,742 image patches covering a variety of viewpoints, diverse lighting conditions, and multiple scales, which makes it well suited to real-world applications. In this study, we trained and evaluated four real-time semantic segmentation models and three object detection models specifically for aphid cluster segmentation and detection. Considering the balance between accuracy and efficiency, Fast-SCNN delivered the most effective segmentation results, achieving 80.46% mean precision, 81.21% mean recall, and 91.66 frames per second (FPS). For object detection, RT-DETR exhibited the best overall performance with a 61.63% mean average precision (mAP), 92.6% mean recall, and 72.55 FPS on an NVIDIA V100 GPU. Our experiments further indicate that aphid cluster segmentation is more suitable for assessing aphid infestations than detection.
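The abstract reports mean precision and mean recall for segmentation without stating here how the averaging is done. As a rough illustration only, the sketch below assumes pixel-wise precision and recall computed per class (background = 0, aphid cluster = 1) and averaged over the two classes; the function names and the class encoding are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple


def precision_recall(pred: Sequence[int], truth: Sequence[int],
                     positive: int) -> Tuple[float, float]:
    """Pixel-wise precision and recall, treating `positive` as the target class."""
    pairs = list(zip(pred, truth))
    tp = sum(1 for p, t in pairs if p == positive and t == positive)
    fp = sum(1 for p, t in pairs if p == positive and t != positive)
    fn = sum(1 for p, t in pairs if p != positive and t == positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def mean_precision_recall(pred: Sequence[int], truth: Sequence[int],
                          classes: Sequence[int] = (0, 1)) -> Tuple[float, float]:
    """Average the per-class scores (assumed averaging scheme, see lead-in)."""
    scores = [precision_recall(pred, truth, c) for c in classes]
    mean_p = sum(p for p, _ in scores) / len(scores)
    mean_r = sum(r for _, r in scores) / len(scores)
    return mean_p, mean_r


# Flattened 4-pixel example: one false positive for the aphid class.
predicted = [1, 1, 0, 0]
ground_truth = [1, 0, 0, 0]
mean_p, mean_r = mean_precision_recall(predicted, ground_truth)
```

In this toy example the aphid class has precision 0.5 and recall 1.0, while the background class has precision 1.0 and recall 2/3, so the class-averaged values are 0.75 and about 0.83.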

Paper Structure (14 sections, 4 equations, 8 figures, 3 tables)

Introduction
Related Works
Dataset
Models
Segmentation Models
Detection Models
Experimental Setup and Evaluation Metrics
Experimental Results
Segmentation
Detection
Discussion
Results Analysis
Recommendation for Aphid Infestation Control
Conclusion

Figures (8)

Figure S1: An intelligent scouting robot with an onboard vision and spray system. The vision system detects and localizes aphid infestations, and the spray system applies pesticide precisely to the infested areas.
Figure S2: The imaging rig used to capture images. It is equipped with three adjustable GoPro cameras to take images from different heights and viewpoints.
Figure S3: Two examples of the different scales that were used in training. Image (left) shows how the original high-resolution 3647 $\times$ 2736 image was subdivided to create patches at the 0.525W $\times$ 0.525H scale (Scale 3), where W and H refer to the width and height of the original image. At this scale, the original image will yield 4 patches. Image (right) shows how the image was further subdivided to create patches at the 0.263W $\times$ 0.263H scale (Scale 2). This scale will yield 16 patches from the original image. In each case, adjacent patches were taken with an overlap of 10%.
Figure S4: Example images from the dataset alongside their corresponding ground truth labels. The first row shows the appearance of the aphid clusters and the second row has the corresponding ground truth masks overlaid on them. The first, second, and third columns show image patches at Scale 1, Scale 2, and Scale 3, respectively.
Figure S5: Histograms showing the mask area percentage across the images and the number of images per scale. The chart on the left shows the percentage of aphid cluster masks using intervals of 10% from 0% to 100%. As most images lie in the interval between 0% and 10%, the chart in the center further breaks that interval down for closer analysis. The chart on the right provides the number of images at each scale.
...and 3 more figures
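The multi-scale patching described in the Figure S3 caption can be sketched as a simple overlapping grid. The caption gives only the patch scales, the resulting patch counts, and the 10% overlap, so the exact placement below is an assumption: patch origins advance by 90% of the patch size, and any thin strip left at the right or bottom edge is ignored. The function name `tile_boxes` and its parameters are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def tile_boxes(width: int, height: int, scale: float,
               overlap_percent: int = 10) -> List[Box]:
    """Lay out patches of size (scale*width, scale*height) on a grid in which
    adjacent patches share `overlap_percent` of their extent (assumed layout)."""
    patch_w = round(scale * width)
    patch_h = round(scale * height)
    # Distance between patch origins: patch size minus the shared overlap.
    stride_w = max(1, patch_w * (100 - overlap_percent) // 100)
    stride_h = max(1, patch_h * (100 - overlap_percent) // 100)
    boxes: List[Box] = []
    for top in range(0, height - patch_h + 1, stride_h):
        for left in range(0, width - patch_w + 1, stride_w):
            boxes.append((left, top, left + patch_w, top + patch_h))
    return boxes


# Image size and scales as given in the caption.
scale3 = tile_boxes(3647, 2736, 0.525)  # 2 x 2 grid
scale2 = tile_boxes(3647, 2736, 0.263)  # 4 x 4 grid
```

Under these assumptions the two scales yield 4 and 16 patches, matching the counts in the caption. Each box can then be used to crop both the image and its ground truth mask so the two stay aligned.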
