Walnut Detection Through Deep Learning Enhanced by Multispectral Synthetic Images

Kaiming Fu, Tong Lei, Maryia Halubok, Brian N. Bailey

TL;DR

The paper addresses the challenge of accurately detecting walnuts in orchards, where walnuts and leaves appear highly similar in RGB and NIR images. It proposes augmenting real data with synthetic images generated via a radiative-transfer-based Helios framework, using reverse ray tracing to label synthetic pixels, and trains YOLOv5 on the augmented RGB and NIR datasets. The results show substantial improvements in detection metrics for both RGB (AP rising from 73.89% to 82.68%; F1 from 72.31 to 80.56) and NIR (AP from 68.17% to 78.63%; F1 from 62.07 to 74.48), demonstrating the value of synthetic data in agricultural image analysis. The work suggests future development of a unified RGB-NIR model and scaling synthetic data to reduce dependence on real images, with practical implications for yield estimation and orchard management.
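The size of the reported improvements is easier to compare when expressed as absolute and relative gains. The short sketch below does that arithmetic using only the four before/after pairs quoted in the TL;DR. It assumes the F1 values are on the same 0 to 100 scale as AP, and the names `reported`, `gains`, and `summary` are illustrative rather than taken from the paper.

```python
# Detection metrics quoted in the TL;DR, as (original, augmented) pairs.
# Assumption: F1 is on the same 0-100 scale as AP.
reported = {
    ("RGB", "AP"): (73.89, 82.68),
    ("RGB", "F1"): (72.31, 80.56),
    ("NIR", "AP"): (68.17, 78.63),
    ("NIR", "F1"): (62.07, 74.48),
}


def gains(before, after):
    """Return (absolute gain in points, relative gain in percent)."""
    absolute = after - before
    relative = 100.0 * absolute / before
    return round(absolute, 2), round(relative, 1)


summary = {key: gains(*values) for key, values in reported.items()}

for (band, metric), (abs_gain, rel_gain) in summary.items():
    print(f"{band} {metric}: +{abs_gain:.2f} points ({rel_gain:.1f}% relative)")
```

On these figures, both NIR gains (10.46 points in AP, 12.41 in F1) are larger than the corresponding RGB gains (8.79 and 8.25).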

Abstract

Accurately identifying walnuts within orchards offers many advantages, substantially improving the efficiency and productivity of walnut orchard management. However, walnuts and leaves closely resemble each other in shape, color, and texture, which makes it difficult to distinguish them precisely during annotation. In this study, we present a novel approach to improve walnut detection efficiency, using YOLOv5 trained on an enriched image set that incorporates both real and synthetic RGB and NIR images. A comparison of results from the original and augmented datasets shows clear improvements in detection when the synthetic images are used.

Paper Structure (5 sections, 3 figures, 2 tables)

Figures (3)

  • Figure 1: (a) RGB image captured with Nikon B500 camera, (b) Annotated RGB image. The camera was positioned 1 m to 2 m away from the canopy's region of interest (ROI).
  • Figure 2: (a) RGB image captured using the SNAPSHOT multispectral camera. (b) NIR image from the same camera. NIR imagery aids in detecting walnuts that are nearly indistinguishable in RGB views. The camera was positioned between 50 cm and 2 m from the canopy's region of interest (ROI).
  • Figure 3: (a) Synthetic RGB image with auto-generated labels, (b) Synthetic NIR image with auto-generated labels. The labels in these synthetic images were directly produced by the simulator rather than manually annotated.
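
To illustrate how simulator-produced labels such as those in Figure 3 could feed a YOLOv5 training set, the sketch below converts a per-pixel object-ID grid into YOLO-style label rows (class, x-center, y-center, width, height, all normalized to the image size). This is a minimal sketch under stated assumptions, not the authors' pipeline or the Helios API: the per-pixel ID mask as input, the function name `mask_to_yolo_boxes`, and the toy mask are all hypothetical.

```python
def mask_to_yolo_boxes(id_mask, class_id=0, background=0):
    """Convert a 2D grid of per-pixel object IDs into YOLO label rows.

    Each row is (class_id, x_center, y_center, width, height), with
    coordinates normalized to [0, 1] by the image width and height.
    """
    height = len(id_mask)
    width = len(id_mask[0])
    extents = {}  # object ID -> [min_col, min_row, max_col, max_row]
    for row, line in enumerate(id_mask):
        for col, obj in enumerate(line):
            if obj == background:
                continue
            box = extents.setdefault(obj, [col, row, col, row])
            box[0] = min(box[0], col)
            box[1] = min(box[1], row)
            box[2] = max(box[2], col)
            box[3] = max(box[3], row)

    rows = []
    for obj in sorted(extents):
        c0, r0, c1, r1 = extents[obj]
        # Pixel indices are inclusive, so add 1 to get the box extent.
        box_w = (c1 - c0 + 1) / width
        box_h = (r1 - r0 + 1) / height
        x_center = (c0 + c1 + 1) / 2 / width
        y_center = (r0 + r1 + 1) / 2 / height
        rows.append((class_id, x_center, y_center, box_w, box_h))
    return rows


# Hypothetical 4x8 mask: 0 is background, 1 and 2 are two walnuts.
toy_mask = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 2, 2, 2],
    [0, 0, 0, 0, 0, 2, 2, 2],
]
labels = mask_to_yolo_boxes(toy_mask)
for label in labels:
    print(" ".join(str(value) for value in label))
```

Because the simulator knows which object each pixel belongs to, labels of this kind require no manual annotation, which is the property the Figure 3 caption highlights.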