Precision yield estimation and mapping in manual strawberry harvesting with instrumented picking carts and a robust data processing pipeline
Uddhav Bhattarai, Rajkishan Arikapudi, Chen Peng, Steven A. Fennimore, Frank N. Martin, Stavros G. Vougioukas
TL;DR
The paper tackles the challenge of generating high-resolution yield maps for manually harvested crops by pairing instrumented picking carts with a robust data-processing pipeline. The carts carry a low-cost SBAS-based GPS receiver, an IMU, and load cells; the pipeline combines a CNN-LSTM filter, DBSCAN row assignment, and violation-correcting algorithms to produce yield distributions and maps at fine spatial resolution. Key contributions include the iCarrito platform, a six-step yield-estimation pipeline, and ground-truth validation showing row-segment accuracy of 90.48% and tray-level accuracy of 94.05%, along with a strong season-long tray-count correlation (r = 0.99). The work demonstrates a scalable, practical approach to precision management in specialty crops that supports better field and labor management in commercial strawberry production, with potential extension to other crops and sensing modalities.
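To make the DBSCAN row-assignment idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes each GPS fix has already been reduced to a single cross-row offset in metres, and it clusters those offsets with a small pure-Python DBSCAN. The function name `dbscan_1d`, the parameter values, and the sample offsets are all hypothetical and chosen only for illustration.

```python
from collections import deque


def dbscan_1d(values, eps, min_samples):
    """Label each value with a cluster id (0, 1, ...) or -1 for noise."""
    n = len(values)
    # Brute-force neighbourhoods; each point counts as its own neighbour.
    neighbours = [
        [j for j in range(n) if abs(values[i] - values[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n  # None means "not visited yet"
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_samples:
            labels[i] = -1  # provisional noise; may become a border point
            continue
        # Point i is a core point: start a new cluster and grow it.
        cluster += 1
        labels[i] = cluster
        queue = deque(neighbours[i])
        while queue:
            j = queue.popleft()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_samples:
                queue.extend(neighbours[j])  # j is also core: keep expanding
    return labels


# Hypothetical cross-row offsets (m): three rows plus one stray fix at 0.85 m.
offsets = [0.05, -0.10, 0.12, 0.00, 1.62, 1.55, 1.70, 1.58,
           0.85, 3.21, 3.15, 3.30, 3.18]
row_labels = dbscan_1d(offsets, eps=0.25, min_samples=3)
print(row_labels)
```

In this sketch each cluster stands for one crop row, and fixes labelled -1 are treated as outliers that a later step would have to reassign or discard. In practice a library implementation such as scikit-learn's `DBSCAN` would likely replace the brute-force neighbour search.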
Abstract
High-resolution yield maps for manually harvested crops are impractical to generate at commercial scale because yield monitors are available only for mechanical harvesters. However, precision crop management relies on accurately determining spatial and temporal yield variability. This study presents the development of an integrated system for precision yield estimation and mapping for manually harvested strawberries. Conventional strawberry picking carts were instrumented with a Global Positioning System (GPS) receiver, an Inertial Measurement Unit (IMU), and load cells to record real-time geo-tagged harvest data and cart motion. Extensive data were collected in two strawberry fields in California, USA, during a harvest season. To address the inconsistencies and errors caused by the sensors and the manual harvesting process, a robust data processing pipeline was developed by integrating supervised deep learning models with unsupervised algorithms. The pipeline was used to estimate the yield distribution and generate yield maps for season-long harvests at the desired grid resolution. The estimated yield distributions were used to calculate two metrics: the total mass harvested over specific row segments and the total mass of trays harvested. Compared with ground truth, these metrics achieved accuracies of 90.48% and 94.05%, respectively. Additionally, the yield estimated from the number of trays harvested per cart over the season-long harvest was more than 94% accurate and correlated strongly (Pearson r = 0.99) with the actual number of trays counted in both fields. The proposed system provides a scalable and practical solution for specialty crops, assisting in efficient yield estimation and mapping, field management, and labor management for sustainable crop production.
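The gridding and evaluation steps described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the paper's pipeline: it assumes the harvested mass has already been attributed to field positions in a local metric frame, it defines accuracy as 100 × (1 − relative absolute error), which the abstract does not specify, and every function name and number below is made up for the example.

```python
import math
from collections import defaultdict


def grid_yield_map(samples, cell_size):
    """Sum harvested mass (kg) into square grid cells of side cell_size (m).

    samples: iterable of (x, y, mass) with x, y in metres in a local frame.
    Returns a dict mapping (column, row) cell indices to total mass.
    """
    cells = defaultdict(float)
    for x, y, mass in samples:
        key = (math.floor(x / cell_size), math.floor(y / cell_size))
        cells[key] += mass
    return dict(cells)


def percent_accuracy(estimated, actual):
    """Assumed definition: 100 * (1 - |estimated - actual| / actual)."""
    return 100.0 * (1.0 - abs(estimated - actual) / actual)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up geo-tagged mass increments: (x in m, y in m, mass in kg).
samples = [(0.4, 0.2, 0.5), (0.9, 0.7, 0.25), (1.3, 0.1, 0.75), (2.6, 1.4, 1.0)]
yield_map = grid_yield_map(samples, cell_size=1.0)
total_mass = sum(yield_map.values())

# Made-up estimated vs. counted trays per cart.
estimated_trays = [10, 12, 15, 20]
counted_trays = [11, 12, 16, 19]
r = pearson_r(estimated_trays, counted_trays)
print(yield_map, total_mass, round(r, 3))
```

Changing `cell_size` changes the grid resolution of the map without touching the rest of the code, which is the sense in which a map can be produced "at the desired grid resolution".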
