Embodied Uncertainty-Aware Object Segmentation

Xiaolin Fang, Leslie Pack Kaelbling, Tomás Lozano-Pérez

TL;DR

The paper tackles ambiguity in object segmentation for embodied robotics by introducing UncOS, which generates a distribution over region-level segmentation hypotheses by repeatedly prompting large pre-trained models. Building on this, EOS constructs a 3D belief state and a belief-guided action planner that selects robot actions to perturb the scene and reduce segmentation uncertainty, in a closed loop with new observations. Empirical results show that UncOS achieves state-of-the-art performance on unseen-object segmentation in single-image settings, and that the embodied extension EOS enables efficient disambiguation through targeted interactions in real-robot experiments. Overall, the work couples region-level uncertainty with action-driven disambiguation, aiming at more reliable manipulation in cluttered scenes of unknown objects.

Abstract

We introduce uncertainty-aware object instance segmentation (UncOS) and demonstrate its usefulness for embodied interactive segmentation. To deal with uncertainty in robot perception, we propose a method for generating a hypothesis distribution of object segmentation. We obtain a set of region-factored segmentation hypotheses together with confidence estimates by making multiple queries of large pre-trained models. This process can produce segmentation results that achieve state-of-the-art performance on unseen object segmentation problems. The output can also serve as input to a belief-driven process for selecting robot actions to perturb the scene to reduce ambiguity. We demonstrate the effectiveness of this method in real-robot experiments. Website: https://sites.google.com/view/embodied-uncertain-seg

Paper Structure

This paper contains 15 sections, 4 equations, 3 figures, 2 tables, 3 algorithms.

Figures (3)

  • Figure 1: Embodied segmentation with the uncertainty-aware object segmentation model (UncOS) as a basis. EOS architecture: an initial RGB-D image is processed by UncOS through repeated prompting to obtain a region-factored distribution over segmentation hypotheses. Unambiguous regions are put into the confident set (outlined in green). Alternative hypotheses are proposed for each uncertain region. A distribution over segmentation hypotheses for the whole image is constructed by taking the Cartesian product of the hypothesis distributions in each region. This factored hypothesis distribution is used to initialize a 3D belief state that forms the basis for information-gathering action planning in embodied object segmentation.
  • Figure 2: Objects used for real-world evaluation.
  • Figure 3: Qualitative results from embodied segmentation. From left to right: the most likely segmentation after 0 to 3 steps of interaction using EOS. Incorrect and corrected segmentations are highlighted with red and green dashed circles, respectively. $F_n$ and its change relative to the previous frame are shown in the corner.
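
The Figure 1 caption describes building a whole-image distribution as the Cartesian product of per-region hypothesis distributions. The sketch below illustrates that construction only. It is not the authors' implementation: the function name `joint_hypotheses`, the string labels, and the probabilities are hypothetical, and it assumes that uncertain regions are treated as independent, so a joint probability is the product of the per-region probabilities.

```python
from itertools import product
from math import prod


def joint_hypotheses(region_dists):
    """Combine per-region hypothesis distributions into a whole-image one.

    region_dists: list with one entry per uncertain region; each entry is a
    list of (hypothesis_label, probability) pairs that sums to 1.
    Returns a list of (labels_tuple, probability), most likely first.
    Assumption (illustrative): regions are independent of each other.
    """
    joint = []
    for combo in product(*region_dists):
        labels = tuple(label for label, _ in combo)
        prob = prod(p for _, p in combo)  # independence across regions
        joint.append((labels, prob))
    joint.sort(key=lambda item: item[1], reverse=True)
    return joint


# Two uncertain regions, each with two alternative segmentations
# (made-up numbers, for illustration only).
regions = [
    [("one object", 0.7), ("two objects", 0.3)],
    [("one object", 0.6), ("two objects", 0.4)],
]
joint = joint_hypotheses(regions)
for labels, prob in joint:
    print(labels, round(prob, 2))
```

The number of joint hypotheses grows as the product of the per-region hypothesis counts, which is one reason to keep the distribution in factored form and enumerate the product only when needed.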