Investigation to answer three key questions concerning plant pest identification and development of a practical identification framework
Ryosuke Wayama, Yuki Sasaki, Satoshi Kagiwada, Nobusuke Iwasaki, Hitoshi Iyatomi
TL;DR
This study tackles image-based plant pest identification by addressing three key questions about data-source separation, ROI pre-detection, and class-definition strategies. It proposes a two-stage framework that first detects ROIs with ShapeMask and then classifies them using EfficientNet-based CNNs, trained on a large, diverse, field-collected dataset. Results show that rigorous data separation, ROI-focused preprocessing, and integration of closely related pest classes with cross-crop training yield high practical performance, achieving 91.0% accuracy and 88.5% macro F1 on unseen-field data across 21 classes, with per-class F1 mostly above 80%. The framework operates at about 476 ms per image and uses a public dataset, supporting deployment in real-world pest diagnosis and potential smartphone-assisted usage.
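The data-source separation described above can be illustrated with a minimal sketch that keeps every field (farm) entirely in either the training set or the test set. The record layout, the field identifiers, and the class labels below are hypothetical placeholders chosen for illustration; they are not taken from the study's dataset, and the authors' actual splitting procedure may differ.

```python
def split_by_field(records, test_fields):
    """Split records so that no field contributes images to both sets.

    `records` is a list of dicts with a "field" key (hypothetical layout).
    `test_fields` is the collection of field IDs held out for testing.
    """
    test_fields = set(test_fields)
    train = [r for r in records if r["field"] not in test_fields]
    test = [r for r in records if r["field"] in test_fields]
    return train, test


# Toy records; file names, farm IDs, and labels are illustrative only.
records = [
    {"image": "img_001.jpg", "field": "farm_01", "label": "class_a"},
    {"image": "img_002.jpg", "field": "farm_01", "label": "class_b"},
    {"image": "img_003.jpg", "field": "farm_02", "label": "class_a"},
    {"image": "img_004.jpg", "field": "farm_03", "label": "class_a"},
    {"image": "img_005.jpg", "field": "farm_03", "label": "class_b"},
]

train, test = split_by_field(records, test_fields={"farm_03"})

# The two sets must not share any field.
train_fields = {r["field"] for r in train}
held_out_fields = {r["field"] for r in test}
assert train_fields.isdisjoint(held_out_fields)
```

Splitting by field rather than by image is what makes the reported test performance an estimate for unseen fields: a random per-image split would place near-duplicate images from the same farm on both sides.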
Abstract
The development of practical and robust automated diagnostic systems for identifying plant pests is crucial for efficient agricultural production. In this paper, we first investigate three key research questions (RQs) that have not been addressed thus far in the field of image-based plant pest identification. Based on the knowledge gained, we then develop an accurate, robust, and fast plant pest identification framework using 334K images comprising 78 combinations of four plant portions (the leaf front, leaf back, fruit, and flower of cucumber, tomato, strawberry, and eggplant) and 20 pest species captured at 27 farms. The results reveal the following. (1) For an appropriate evaluation of the model, the test data should not include images from fields in which the training images were collected; failing that, other measures to increase the diversity of the test set should be taken. (2) Pre-extraction of ROIs, such as leaves and fruits, helps to improve identification accuracy. (3) Integrating closely related species that share the same control methods into a single class, and training across crops for the same pests, are both effective. Our two-stage plant pest identification framework, which combines ROI detection with convolutional neural network (CNN)-based identification, achieved a highly practical performance: a mean accuracy of 91.0% and a macro F1 score of 88.5% on 12,223 test instances from 21 classes collected in unseen fields, after training on 318,971 samples from 25 classes. The average identification time was 476 ms/image.
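As a reminder of how the two reported metrics differ, the following sketch computes plain accuracy and macro F1 on a toy label list. It is a generic illustration, not the authors' evaluation code: the labels are invented, the macro average here is taken over every class that appears in either the true or the predicted labels (an assumption), and the abstract's "mean accuracy" may be averaged in a way this sketch does not reproduce.

```python
def accuracy(y_true, y_pred):
    """Fraction of instances whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores.

    Classes are taken from the union of true and predicted labels
    (an assumption of this sketch).
    """
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels, illustrative only.
y_true = ["a", "a", "b", "b", "b", "c"]
y_pred = ["a", "b", "b", "b", "b", "c"]

acc = accuracy(y_true, y_pred)   # 5 of 6 correct
f1 = macro_f1(y_true, y_pred)    # mean of 2/3, 6/7, and 1
```

Because macro F1 weights every class equally, it drops below accuracy when rare classes are identified less reliably than common ones, which is consistent with the gap between the 91.0% and 88.5% figures reported above.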
