vMF-Contact: Uncertainty-aware Evidential Learning for Probabilistic Contact-grasp in Noisy Clutter

Yitian Shi, Edgar Welte, Maximilian Gilles, Rania Rayyes

TL;DR

This work tackles robust 6-DoF grasping in cluttered, uncertain environments by explicitly separating and modeling aleatoric and epistemic uncertainties. It introduces vMF-Contact, an evidential learning framework that uses a von Mises–Fisher posterior to capture directional uncertainty in contact grasps, paired with a Bayesian loss and an auxiliary point-reconstruction task to enhance feature expressiveness. The method provides principled posterior updates, an informative prior aligned to surface normals, and demonstrates improved uncertainty calibration, OOD generalization, and real-world grasp success without sim-to-real transfer. The combination of probabilistic grasp representations, normalizing-flow–based evidence estimation, and auxiliary reconstruction yields practical, real-time uncertainty-aware grasping suitable for cluttered, noisy environments with OOD objects.
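
The posterior update mentioned above can be illustrated with the textbook conjugate update for a vMF mean direction, in which the prior and the prediction are added in natural-parameter space (direction scaled by concentration). This is a minimal sketch under stated assumptions, not the paper's exact formulation: the function name `vmf_posterior_update` and the parameters `prior_kappa` and `evidence` are hypothetical, and using a surface normal with a fixed concentration as the prior is an illustrative choice.

```python
import math


def normalize(v):
    """Return v scaled to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def vmf_posterior_update(prior_dir, prior_kappa, pred_dir, evidence):
    """Combine a prior direction and a predicted direction.

    Each direction is weighted by its concentration (prior_kappa) or
    evidence, and the weighted vectors are summed. The length of the sum
    is the posterior concentration; its direction is the posterior mean.
    All names here are hypothetical, chosen for illustration only.
    """
    p = normalize(prior_dir)
    q = normalize(pred_dir)
    eta = [prior_kappa * a + evidence * b for a, b in zip(p, q)]
    post_kappa = math.sqrt(sum(x * x for x in eta))
    return normalize(eta), post_kappa


# Assumed example: prior along a surface normal, prediction along the x-axis.
normal = [0.0, 0.0, 1.0]
prediction = [1.0, 0.0, 0.0]
low_dir, low_kappa = vmf_posterior_update(normal, 1.0, prediction, 0.1)
high_dir, high_kappa = vmf_posterior_update(normal, 1.0, prediction, 100.0)
```

With little evidence the posterior direction stays close to the prior (the surface normal), while strong evidence pulls it toward the network's prediction and raises the posterior concentration. This mirrors the intended behavior of falling back to the prior on inputs the model has little evidence for.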

Abstract

Grasp learning in noisy environments, which involve occlusions, sensor noise, and out-of-distribution (OOD) objects, poses significant challenges. Recent learning-based approaches focus primarily on capturing aleatoric uncertainty from inherent data noise. Epistemic uncertainty, which supports OOD recognition, is often addressed by ensembles that require multiple forward passes, limiting real-time application. In this paper, we propose an uncertainty-aware approach for 6-DoF grasp detection using evidential learning to comprehensively capture both uncertainties in real-world robotic grasping. As a key contribution, we introduce vMF-Contact, a novel architecture for learning hierarchical contact grasp representations with probabilistic modeling of directional uncertainty as a von Mises-Fisher (vMF) distribution. To achieve this, we analyze the theoretical formulation of the second-order objective on the posterior parametrization, providing formal guarantees for the model's ability to quantify uncertainty and improve grasp prediction performance. Moreover, we enhance feature expressiveness by applying partial point reconstruction as an auxiliary task, improving uncertainty quantification as well as generalization to unseen objects. In real-world experiments, our method demonstrates a significant improvement of 39% in the overall clearance rate compared to the baselines. The code is available at: https://github.com/YitianShi/vMF-Contact/

Paper Structure

This paper contains 24 sections, 9 equations, 5 figures, 2 tables.

Figures (5)

  • Figure 1: Inference pipeline of vMF-Contact in the real world, based on the posterior update.
  • Figure 2: Illustration of the posterior update for the grasp baseline direction with respect to the predictive distributional uncertainty. $R_{t}$ denotes the rotation matrix that transforms $\mathbf{b}$ to the $t$-th bin.
  • Figure 3: vMF-Contact architecture. The raw point cloud from a single camera view passes through the PointNet-based backbone (I), where the features are further enhanced by geometry-aware self-attention. The downscaled point features are fed to (II) pointwise linear layers that predict the conditional grasp orientations. A residual flow is trained to estimate the density of the point features (III) and provide the evidence used in the posterior update (IV). Auxiliary point completion (IV) is applied using a shared folding net to enhance feature expressiveness (green points) and robustness against input noise, supervised by "ground-truth" point clouds (blue).
  • Figure 4: i) Data generated in simulation with ground-truth point clouds ii) Ground-truth grasps in the scene iii) ID objects iv) OOD objects
  • Figure 5: Visualization of predicted grasps (red) above the graspness threshold of 0.5, compared with their nearest ground-truth grasps (green). Smaller red balls indicate higher total uncertainty.