Hierarchical Object Detection and Recognition Framework for Practical Plant Disease Diagnosis

Kohei Iwano, Shogo Shibuya, Satoshi Kagiwada, Hitoshi Iyatomi

TL;DR

The paper tackles robust plant disease diagnosis from field images, where high labeling costs and the need to recognize healthy cases hinder single-stage methods. It proposes a hierarchical object detection and recognition framework (HODRF) that first uses object detection (OD) to locate regions of interest (ROIs) and then applies a classification model (CL) to diagnose diseases around those ROIs, combining the strengths of both approaches. Using YOLOv7-w6 and EfficientNetV2-s on a large, cross-field dataset with 21 classes across four crops, HODRF improves on YOLOv7 alone by 5.8 to 21.5 points on healthy data and by 0.6 to 7.5 points in macro F1, and on EfficientNetV2 alone by 1.1 to 7.2 points in macro F1. The results suggest that ROI-guided CL can reduce OD's over-detection of healthy tissue and soften the effect of domain shift, offering a scalable, cost-effective path toward reliable plant disease diagnosis. Limitations remain under large domain shifts and when multiple infections occur at once.

Abstract

Recently, object detection methods (OD; e.g., YOLO-based models) have been widely used in plant disease diagnosis. Compared to classification methods (CL; e.g., CNN models), they are more robust to variations in shooting distance and better at detecting small lesions. However, OD suffers from low diagnostic performance on hard-to-detect diseases and from high labeling costs. In addition, because OD cannot be trained explicitly on healthy cases, it risks false positives. We propose the Hierarchical object detection and recognition framework (HODRF), a two-stage system that combines the strengths of OD and CL for plant disease diagnosis. In the first stage, HODRF uses OD to identify regions of interest (ROIs) without specifying the disease. In the second stage, CL diagnoses the disease from the area surrounding each ROI. HODRF offers several advantages: (1) Because OD detects only a single type of ROI, HODRF can detect diseases with limited training images by leveraging what OD has learned from the lesions of other diseases. (2) While OD alone over-detects healthy cases, HODRF significantly reduces these errors by using CL in the second stage. (3) CL's accuracy improves in HODRF because it classifies diagnostic targets supplied as ROIs, which makes it less vulnerable to changes in target size. (4) HODRF benefits from CL's lower annotation costs, allowing it to learn from a larger number of images. We implemented HODRF using YOLOv7 for OD and EfficientNetV2 for CL and evaluated its performance on a large-scale dataset (4 crops, 20 diseased and healthy classes, 281K images). HODRF outperformed YOLOv7 alone by 5.8 to 21.5 points on healthy data and by 0.6 to 7.5 points in macro F1, and it improved macro F1 by 1.1 to 7.2 points over EfficientNetV2 alone.
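
The two-stage flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Roi`, `expand_box`, `diagnose`, and `classify_region`, the 50% margin, the 0.25 score threshold, the rule that an image with no ROI is reported as healthy, and the summing of class probabilities across ROIs are all assumptions made for illustration. The detector and classifier are passed in as plain data and a callable, so the sketch does not depend on any particular YOLOv7 or EfficientNetV2 API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels


@dataclass
class Roi:
    """One stage-1 detection of the single, disease-agnostic ROI class."""
    box: Box
    score: float  # detector confidence


def expand_box(box: Box, image_size: Tuple[int, int], margin: float = 0.5) -> Box:
    """Grow a box by `margin` of its width/height on each side, clipped to the image.

    The margin value is an assumption; the abstract only says that CL looks
    at the area surrounding each ROI.
    """
    x1, y1, x2, y2 = box
    width, height = image_size
    dx = int((x2 - x1) * margin)
    dy = int((y2 - y1) * margin)
    return (max(0, x1 - dx), max(0, y1 - dy), min(width, x2 + dx), min(height, y2 + dy))


def diagnose(
    image_size: Tuple[int, int],
    rois: List[Roi],
    classify_region: Callable[[Box], Dict[str, float]],
    min_score: float = 0.25,
    healthy_label: str = "healthy",
) -> str:
    """Stage 2: classify the surroundings of each ROI and return one label per image."""
    kept = [roi for roi in rois if roi.score >= min_score]
    if not kept:
        # Assumption: with no confident ROI, the image is reported as healthy.
        return healthy_label
    totals: Dict[str, float] = {}
    for roi in kept:
        region = expand_box(roi.box, image_size)
        # Assumption: sum class probabilities over ROIs, then take the argmax.
        for label, prob in classify_region(region).items():
            totals[label] = totals.get(label, 0.0) + prob
    return max(totals, key=lambda label: totals[label])


def stub_classifier(region: Box) -> Dict[str, float]:
    """Hypothetical stand-in for the stage-2 classifier; labels are placeholders."""
    return {"healthy": 0.2, "disease_a": 0.8}


if __name__ == "__main__":
    detections = [Roi(box=(10, 10, 30, 30), score=0.9)]
    print(diagnose((100, 100), detections, stub_classifier))
```

Because the classifier in this sketch can itself answer "healthy", a spurious ROI on a healthy leaf does not automatically become a disease diagnosis, which is the mechanism behind advantage (2).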

Paper Structure (11 sections, 5 figures, 2 tables)

Figures (5)

  • Figure 1: Diagram of the HODRF
  • Figure 2: ROI detected in the ROI detection stage
  • Figure 3: Detected diagnosis target based on ROI
  • Figure 4: Example image trained for the CL
  • Figure 6: Example final diagnostic results of the three systems on evaluation data from a field different from the training fields