Robust Loss Functions for Object Grasping under Limited Ground Truth

Yangfan Deng, Mengyao Zhang, Yong Zhao

TL;DR

This work tackles object grasping under limited ground truth by introducing two robust loss strategies. For missing labels, it combines pseudo-labeling with a novel predicted-category-probability term to better leverage unlabeled data, resulting in a final loss $L_m=\lambda_1 L_w+\lambda_2 L_u$. For noisy labels, it adopts a symmetric cross-entropy framework with a refined $s(c|d)$ to mitigate label corruption, forming $L_n=\alpha_1 L_{ce}+\alpha_2 L_{rce}$. Experiments on GraspNet-1Billion show notable improvements in accuracy and generalization, particularly on novel objects, demonstrating practical robustness for grasping systems in data-scarce or noisy environments.
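The noisy-label loss above has the form $L_n=\alpha_1 L_{ce}+\alpha_2 L_{rce}$. As a rough illustration, the sketch below implements the generic symmetric cross-entropy for a single sample with a one-hot label. It does not include the paper's refined $s(c|d)$; the function names, the default weights, and the constant `log_zero` (the value substituted for $\log 0$) are assumptions made for this example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label, eps=1e-7):
    """Standard cross-entropy against a one-hot label: -log p[label]."""
    return -math.log(max(probs[label], eps))


def reverse_cross_entropy(probs, label, log_zero=-4.0):
    """Reverse cross-entropy: -sum_k p[k] * log q[k], with q one-hot.

    log q[label] = log 1 = 0, so only the other classes contribute.
    log 0 is undefined, so it is replaced by the assumed constant log_zero.
    """
    return -sum(p * log_zero for k, p in enumerate(probs) if k != label)


def symmetric_loss(logits, label, alpha1=1.0, alpha2=1.0):
    """Weighted sum alpha1 * L_ce + alpha2 * L_rce for one sample."""
    probs = softmax(logits)
    return (alpha1 * cross_entropy(probs, label)
            + alpha2 * reverse_cross_entropy(probs, label))
```

With a one-hot label the reverse term reduces to `-log_zero * (1 - probs[label])`, which is bounded. That bound is why the symmetric form is generally less sensitive to a wrong label than cross-entropy alone.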

Abstract

Object grasping is a crucial technology that enables robots to perceive and interact with their environment. In practical applications, however, the ground truth available for training the convolutional neural network is often missing or noisy, which decreases the accuracy of the model. This work therefore proposes different loss functions to handle these two problems and improve the accuracy of the network. For missing ground truth, a new predicted category probability is defined for unlabeled samples, which works effectively in conjunction with pseudo-labeling. For noisy ground truth, a symmetric loss function is introduced to resist corruption from label noise. The proposed loss functions are powerful, robust, and easy to use. Experimental results on a typical grasping neural network show that our method can improve performance by 2 to 13 percent.
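To make the missing-label idea concrete, here is a minimal sketch of plain pseudo-labeling combined with a supervised term as a weighted sum, in the spirit of $L_m=\lambda_1 L_w+\lambda_2 L_u$ from the TL;DR. It is not the paper's predicted-category-probability method. The confidence `threshold`, the function names, and the choice to treat the first term as cross-entropy on labeled samples and the second as cross-entropy on pseudo-labeled samples are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pseudo_label(probs, threshold=0.9):
    """Return the argmax class if its probability reaches threshold, else None."""
    conf = max(probs)
    return probs.index(conf) if conf >= threshold else None


def mean(values):
    """Average of a list, defined as 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0


def combined_loss(labeled, unlabeled, lambda1=1.0, lambda2=0.5, threshold=0.9):
    """labeled: list of (logits, label) pairs; unlabeled: list of logits."""
    # Supervised term: cross-entropy against the given labels.
    sup = [-math.log(softmax(logits)[y]) for logits, y in labeled]

    # Unsupervised term: cross-entropy against confident pseudo-labels only.
    unsup = []
    for logits in unlabeled:
        probs = softmax(logits)
        k = pseudo_label(probs, threshold)
        if k is not None:
            unsup.append(-math.log(probs[k]))

    return lambda1 * mean(sup) + lambda2 * mean(unsup)
```

Unlabeled samples whose top probability falls below the threshold contribute nothing, so uncertain predictions are not reinforced as if they were ground truth.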

Paper Structure (10 sections, 20 equations, 2 figures, 2 tables)

This paper contains 10 sections, 20 equations, 2 figures, 2 tables.

Figures (2)

  • Figure 1: The classical pipeline for generating grasps. Given scene data as input, the neural network first segments the objects, and a 6D pose estimation technique determines the position of the target objects. Grasps are then generated for the target objects. Finally, all generated grasp candidates are evaluated to find the best ones, using criteria such as collision detection and grasp success rate. The robot's gripper can then execute the selected grasp.
  • Figure 2: The representation of the final grasp. (a) The coordinate system of the gripper. (b) Our final representation of the grasp. $\bm{obj}$ denotes the center of the object. In a practical grasping scenario, the gripper moves forward along the direction $\bm{v}$ by a distance $d$ and grasps the target object with width $w$.
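
The grasp parameters named in the Figure 2 caption can be collected in a small container. The sketch below is only an illustration of that caption: the class name `Grasp`, the method `approach_displacement`, and the normalization of $\bm{v}$ to unit length are assumptions, not the paper's implementation.

```python
import math
from dataclasses import dataclass
from typing import Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Grasp:
    obj: Vec3   # center of the object
    v: Vec3     # direction the gripper follows
    d: float    # distance to move forward along v
    w: float    # gripper width used for the grasp

    def approach_displacement(self) -> Vec3:
        """Vector the gripper travels: d times the unit vector along v."""
        norm = math.sqrt(sum(c * c for c in self.v))
        if norm == 0.0:
            raise ValueError("direction v must be non-zero")
        return tuple(self.d * c / norm for c in self.v)


# Hypothetical values, chosen only to exercise the class.
grasp = Grasp(obj=(0.0, 0.0, 0.0), v=(0.0, 0.0, 2.0), d=0.1, w=0.05)
step = grasp.approach_displacement()
```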