Autonomous Robotic Pepper Harvesting: Imitation Learning in Unstructured Agricultural Environments

Chung Hee Kim, Abhisesh Silwal, George Kantor

Abstract

Automating tasks in outdoor agricultural fields poses significant challenges due to environmental variability, unstructured terrain, and diverse crop characteristics. We present a robotic system for autonomous pepper harvesting designed to operate in these unprotected, complex settings. Using a custom handheld shear-gripper, we collected 300 demonstrations to train a visuomotor policy, enabling the system to adapt to varying field conditions and crop diversity. We achieved a success rate of 28.95% with a cycle time of 31.71 seconds, comparable to existing systems tested under more controlled conditions such as greenhouses. Our system demonstrates the feasibility and effectiveness of leveraging imitation learning for automated harvesting in unstructured agricultural environments. This work aims to advance scalable, automated robotic solutions for agriculture in natural settings.

Paper Structure

This paper contains 26 sections, 2 equations, 11 figures, 2 tables.

Figures (11)

  • Figure 1: (a) Robotic automation in agriculture faces unique domain-specific challenges, including variable plant morphology, unpredictable lighting, and unstructured field conditions. (b) Pepper harvesting demonstrations are collected with a custom handheld shear-gripper device to train a visuomotor policy via imitation learning. (c) The trained policy enables autonomous robotic pepper harvesting in an outdoor field setting.
  • Figure 2: (a) Handheld shear-gripper device used for demonstration data collection, and (b) custom robotic end-effector counterpart used for robotic deployment. The handheld device includes a fiducial cube for pose tracking via an external camera and a fiducial tag on the shear mechanism for tracking actuation. The handheld device is manually operated by the user, while the robotic end-effector is actuated by a servo motor.
  • Figure 3: Pipeline for robust cube pose estimation: (1) Detect visible ArUco markers and solve the PnP problem for an initial cube pose estimate. (2) Project the visible cube face onto a planar view based on the initial pose. (3) Use SSIM to filter out noisy faces by comparing warped faces against expected templates. (4) Redetect and refine marker corners on SSIM-passed faces to enhance accuracy. (5) Map refined tag corners back to the original image space. (6) Recompute the PnP problem using filtered, refined tag corners for precise cube pose tracking.
  • Figure 4: The plot shows the 6-DOF pose of the fiducial cube captured during a single demonstration in a static environment. Our filter/refine method (orange line) significantly reduces noise when compared to the direct PnP method (purple line). The tracking result from ORB-SLAM3 is also plotted in green.
  • Figure 5: A harvesting demonstration begins with the operator approaching the pepper with the handheld shear-gripper device, followed by cutting and grasping the pepper’s peduncle, and finally retracting from the pepper plant. The top row displays the external camera's viewpoint capturing the fiducial cube for tracking, while the bottom row displays the viewpoint from the gripper's fisheye camera.
  • ...and 6 more figures
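
The SSIM filtering step in the Figure 3 pipeline (comparing each warped cube face against its expected template and discarding noisy faces) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a simplified single-window SSIM computed over the whole face rather than a sliding-window variant, and the function names, the face IDs, and the 0.5 threshold are hypothetical choices made only for illustration.

```python
def global_ssim(a, b, data_range=255.0):
    """Single-window SSIM between two equal-sized grayscale images.

    Images are flat sequences of pixel intensities. This is a simplified
    stand-in (assumption) for a windowed SSIM implementation.
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("images must be non-empty and the same size")
    n = len(a)
    # Standard SSIM stabilizing constants (K1 = 0.01, K2 = 0.03).
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_a = sum(a) / n
    mu_b = sum(b) / n
    var_a = sum((x - mu_a) ** 2 for x in a) / n
    var_b = sum((y - mu_b) ** 2 for y in b) / n
    cov = sum((x - mu_a) * (y - mu_b) for x, y in zip(a, b)) / n
    numerator = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    denominator = (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    return numerator / denominator


def filter_faces(warped_faces, templates, threshold=0.5):
    """Return IDs of faces whose SSIM against their template meets the threshold.

    `warped_faces` and `templates` map a face ID to a flat pixel sequence.
    The threshold value is a hypothetical placeholder, not taken from the paper.
    """
    return [
        face_id
        for face_id, face in warped_faces.items()
        if global_ssim(face, templates[face_id]) >= threshold
    ]
```

Only the faces returned by `filter_faces` would then be passed on to corner refinement and the final PnP solve; a face whose warped view disagrees with its template (for example because of occlusion or motion blur) is dropped before it can corrupt the pose estimate.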