A Novel Approach to Tomato Harvesting Using a Hybrid Gripper with Semantic Segmentation and Keypoint Detection
Shahid Ansari, Mahendra Kumar Gohil, Bishakh Bhattacharya
TL;DR
This work addresses the challenge of selectively harvesting soft fruits, specifically tomatoes, by combining a novel hybrid gripper (soft auxetic fingers with a rigid outer exoskeleton) with a depth/RGB vision system that performs semantic segmentation of ripeness and keypoint detection of the pedicel and tomato center. The gripper couples a scotch-yoke driven actuator with six soft fingers and a latex basket to achieve conformable yet strong grasping, while a conical separator moves neighboring fruits aside before the grasp. A Detectron2-based vision pipeline classifies ripeness and detects the keypoints; PSO-based trajectory planning translates perception into safe, efficient manipulation. Experimental results show a success rate of about 80% and an average cycle time of about 24.34 seconds. Future work aims to improve pedicel cutting reliability and to optimize the overall gripper for robust field deployment.
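To make the PSO-based planning step more concrete, the following is a minimal, generic particle swarm optimization sketch in Python. It is not the authors' planner: the cost function, the single via-point formulation, the spherical obstacle, and all names and parameter values (`path_cost`, `pso_minimize`, `inertia`, `c_cog`, `c_soc`, the clearance of 0.2) are assumptions chosen only for illustration. The swarm searches for one 3D via-point between a start and a goal position so that the two-segment path stays short while keeping clear of a neighboring fruit modeled as a sphere.

```python
import math
import random


def path_cost(via, start, goal, obstacle, clearance, n_samples=10):
    """Length of start -> via -> goal plus a penalty for samples near the obstacle."""
    length = math.dist(start, via) + math.dist(via, goal)
    penalty = 0.0
    for a, b in ((start, via), (via, goal)):
        for k in range(n_samples + 1):
            t = k / n_samples
            point = [a[d] + t * (b[d] - a[d]) for d in range(3)]
            gap = clearance - math.dist(point, obstacle)
            if gap > 0:
                penalty += 100.0 * gap  # hypothetical penalty weight
    return length + penalty


def pso_minimize(cost, dim, bounds, n_particles=30, n_iters=100,
                 inertia=0.7, c_cog=1.5, c_soc=1.5, seed=0):
    """Basic global-best PSO; returns (best_position, best_cost)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_cost[i])
    g_pos, g_cost = best_pos[g][:], best_cost[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + pull to personal best + pull to global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_cog * r1 * (best_pos[i][d] - pos[i][d])
                             + c_soc * r2 * (g_pos[d] - pos[i][d]))
                # Keep the particle inside the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            c = cost(pos[i])
            if c < best_cost[i]:
                best_pos[i], best_cost[i] = pos[i][:], c
                if c < g_cost:
                    g_pos, g_cost = pos[i][:], c
    return g_pos, g_cost


# Hypothetical scene: straight-line path blocked by a fruit at the midpoint.
START, GOAL, OBSTACLE = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.5, 0.0, 0.0)
via_point, best = pso_minimize(
    lambda v: path_cost(v, START, GOAL, OBSTACLE, clearance=0.2),
    dim=3, bounds=(-1.0, 1.0))
print("via-point:", via_point, "cost:", best)
```

A real planner would optimize over joint-space trajectories with kinematic and collision constraints; the single via-point here only shows how a swarm trades path length against clearance.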
Abstract
Agriculture and farming industries can now draw on advances in robotics and automation to harvest fruits and vegetables with robots whose grasping force adapts to the compliance, or softness, of the crop. Successful operation depends on a gripper that can adapt to the mechanical properties of the crop. This paper proposes a new robotic harvesting approach for tomatoes using a novel hybrid gripper with a soft caging effect. The gripper has six fingers built from flexible, passive auxetic structures with rigid outer exoskeletons, which provide good gripping strength and shape conformability. It is actuated through a scotch-yoke mechanism driven by a servo motor. To guide the picking operation, a vision system based on a depth camera and an RGB camera identifies the fruit. It combines deep learning-based keypoint detection of the tomato's pedicel and body, for localization under occlusion and variable ambient light, with semantic segmentation of ripe and unripe tomatoes. In addition, robust trajectory planning of the robotic arm based on input from the vision system, together with control of the gripper movements, provides secure tomato handling. The tunable grasping force of the gripper would allow robotic handling of fruits with a broad range of compliance.
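As an illustration of how a servo rotation becomes linear gripper motion, the sketch below gives the ideal kinematics of a generic scotch-yoke mechanism: a crank pin at radius r drives a slotted yoke, so the yoke displacement is x = r sin(θ) when the crank angle θ is measured from mid-stroke. The crank radius of 10 mm and the function names are hypothetical and are not taken from the gripper described here; friction, backlash, and the linkage from yoke travel to finger closure are ignored.

```python
import math


def yoke_displacement(crank_radius_mm, crank_angle_rad):
    """Ideal scotch-yoke displacement from mid-stroke: x = r * sin(theta)."""
    return crank_radius_mm * math.sin(crank_angle_rad)


def crank_angle_for_displacement(crank_radius_mm, displacement_mm):
    """Inverse mapping: crank angle in [-pi/2, pi/2] for a desired yoke displacement."""
    ratio = displacement_mm / crank_radius_mm
    if not -1.0 <= ratio <= 1.0:
        raise ValueError("displacement exceeds the crank radius (half the stroke)")
    return math.asin(ratio)


# Hypothetical crank radius; the full stroke of the yoke is 2 * r.
R_MM = 10.0
for angle_deg in (0, 30, 60, 90):
    x = yoke_displacement(R_MM, math.radians(angle_deg))
    print(f"servo angle {angle_deg:3d} deg -> yoke displacement {x:.2f} mm")
```

Because the displacement follows a sine curve, equal servo steps move the yoke less near the ends of the stroke than near mid-stroke, which is one reason this kind of mechanism can offer fine position resolution at the ends of its travel.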
