Spatio-Temporal Metric-Semantic Mapping for Persistent Orchard Monitoring: Method and Dataset

Jiuzhou Lei, Ankit Prabhu, Xu Liu, Fernando Cladera, Mehrad Mortazavi, Reza Ehsani, Pratik Chaudhari, Vijay Kumar

TL;DR

This work addresses persistent orchard monitoring by introducing a 4D metric-semantic mapping framework that fuses LiDAR and RGB data for precise 3D fruit localization and employs a two-stage 4D data association to link fruit observations across growth-season sessions. The method leverages position, visual, and topology cues to significantly outperform baselines in 4D fruit association while delivering accurate fruit counts (3.1% error) and reliable size estimates (mean error ~1.1 cm), demonstrated on a new multimodal orchard dataset spanning five fruit species. A public dataset release accompanies the approach, enabling broader phenotyping and yield-estimation research. Overall, the framework enables actionable orchard insights for agro-management and supports future robotics-enabled autonomous monitoring.

Abstract

Monitoring orchards at the individual tree or fruit level throughout the growth season is crucial for plant phenotyping and horticultural resource optimization, such as chemical use and yield estimation. We present a 4D spatio-temporal metric-semantic mapping system that integrates multi-session measurements to track fruit growth over time. Our approach combines a LiDAR-RGB fusion module for 3D fruit localization with a 4D fruit association method leveraging positional, visual, and topology information for improved data association precision. Evaluated on real orchard data, our method achieves a 96.9% fruit counting accuracy for 1,790 apples across 60 trees, a mean fruit size estimation error of 1.1 cm, and a 23.7% improvement in 4D data association precision over baselines. We publicly release a multimodal dataset covering five fruit species across their growth seasons at https://4d-metric-semantic-mapping.org/

Paper Structure

This paper contains 14 sections, 6 equations, 8 figures, 1 table, and 1 algorithm.

Figures (8)

  • Figure 1: A 4D metric-semantic map of an apple orchard from April to August. Each row represents a time session. The left panel shows the metric-semantic map, the middle panel zooms in on the 6th tree, and the right panel presents its raw RGB image. Tree skeleton meshes [adtree] are for visualization only.
  • Figure 2: System diagram. Module 1 (green box): Our system takes in sensor data from the LiDAR, RGB camera, and IMU. Module 2 (blue box): Object detection and odometry. We use Faster-LIO [bai2022faster], a LiDAR-inertial odometry (LIO) algorithm, for pose estimation and point cloud motion undistortion. In parallel, YOLOv8 [Jocher_YOLOv8_by_Ultralytics_2023], an instance segmentation model, detects and segments fruits in the RGB images. Module 3 (orange box): LiDAR-RGB fusion for fruit localization. The point cloud is back-projected into the segmentation mask to estimate the depth of each fruit. Fruit detections from different image frames are tracked using the Hungarian assignment algorithm with mask IoU as the cost. We then optimize the fruit positions by minimizing the reprojection error of the fruit centroids. Module 4 (purple box): 4D data association, which takes the optimized 3D fruit landmarks and images as input. Fruits are associated across sessions using our two-stage matching algorithm, with a cost function based on position, visual, and topology information. Module 5 (cyan box): 4D metric-semantic map generation. Using the 4D data association, we construct a 4D metric-semantic map that provides actionable information such as fruit counts, sizes, and positions throughout the entire growth season.
  • Figure 3: Precise monitoring of fruit growth. The left panel shows examples of our 4D spatio-temporal fruit tracking results. The right panel quantitatively shows how these fruits grow over time. The X-axis shows the date, and the Y-axis shows fruit sizes in cm. Different colors represent different fruits. Fruit sizes increase over time, with varying but generally similar growth rates reflected by the slope of their trendlines. The average fruit sizes are shown in a black dashed line with triangle markers. Our method automatically generates fruit tracks across multiple seasons, providing users with detailed information on each fruit, such as size, color, shape, and position over time.
  • Figure 4: Fruit count ground truth vs. estimated (Y-axis) per tree batch (X-axis). The X-axis represents the IDs of trees included in each batch, with batches containing 2 to 3 trees. For each batch, the estimated counts closely align with the human-labeled ground truth, demonstrating the accuracy of our system. Detailed statistics are provided in Table 1.
  • Figure 5: Fruit size ground truth vs. estimated. The mean and standard deviation of absolute errors in fruit size estimates are 1.10 cm (0.43 in) and 0.45 cm (0.18 in), respectively.
  • ...and 3 more figures
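
The frame-to-frame tracking step described in the Figure 2 caption (Hungarian assignment with mask IoU as the cost) can be illustrated with a minimal sketch. This is not the authors' implementation. The helper names (mask_iou, match_masks, square_mask), the use of 1 - IoU as the cost, and the min_iou rejection threshold are all assumptions made for illustration. The assignment itself uses SciPy's linear_sum_assignment, which solves the same minimum-cost assignment problem the Hungarian algorithm addresses.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def mask_iou(a: np.ndarray, b: np.ndarray) -> float:
    """Intersection-over-union of two boolean masks of equal shape."""
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 0.0
    return float(np.logical_and(a, b).sum() / union)


def match_masks(prev_masks, curr_masks, min_iou=0.3):
    """Match fruit masks between two frames.

    Returns a list of (prev_index, curr_index) pairs. The min_iou
    threshold is a hypothetical parameter, not taken from the paper.
    """
    if not prev_masks or not curr_masks:
        return []
    # Assumed cost: 1 - IoU, so that higher overlap means lower cost.
    cost = np.array(
        [[1.0 - mask_iou(p, c) for c in curr_masks] for p in prev_masks]
    )
    rows, cols = linear_sum_assignment(cost)  # minimum-cost assignment
    # Discard assigned pairs whose overlap is too small to be the same fruit.
    return [
        (int(r), int(c))
        for r, c in zip(rows, cols)
        if 1.0 - cost[r, c] >= min_iou
    ]


def square_mask(top, left, size, shape=(20, 20)):
    """Toy stand-in for a segmentation mask: a filled square."""
    m = np.zeros(shape, dtype=bool)
    m[top:top + size, left:left + size] = True
    return m


# Two fruits in the previous frame; in the current frame both have shifted
# by one pixel and a third, previously unseen fruit appears.
prev_masks = [square_mask(2, 2, 5), square_mask(12, 12, 5)]
curr_masks = [square_mask(13, 12, 5), square_mask(2, 3, 5), square_mask(0, 14, 3)]

matches = match_masks(prev_masks, curr_masks)
print(matches)
```

In this toy example the third mask in the current frame is left unmatched, which a tracker would typically treat as the start of a new fruit track.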