Coupled Particle Filters for Robust Affordance Estimation

Patrick Lowin, Vito Mengers, Oliver Brock

Abstract

Robotic affordance estimation is challenging due to visual, geometric, and semantic ambiguities in sensory input. We propose a method that disambiguates these signals using two coupled recursive estimators for sub-aspects of affordances: graspable and movable regions. Each estimator encodes property-specific regularities to reduce uncertainty, while their coupling enables bidirectional information exchange that focuses attention on regions where both agree, i.e., affordances. Evaluated on a real-world dataset, our method outperforms three recent affordance estimators (Where2Act, Hands-as-Probes, and HRP) in precision by 308%, 245%, and 257%, respectively, and remains robust under challenging conditions such as low light or cluttered environments. Furthermore, our method achieves a 70% success rate in our real-world evaluation. These results demonstrate that coupling complementary estimators yields precise, robust, and embodiment-appropriate affordance predictions.

Paper Structure (23 sections, 4 equations, 10 figures, 1 algorithm)

Figures (10)

  • Figure 1: Robots can interact by grasping an object and then moving it. Consequently, affordances emerge in regions where these properties co-occur. By estimating these properties separately and leveraging their relationship, we can resolve ambiguities in both modalities and estimate robust affordances.
  • Figure 2: Our approach estimates affordances as a combination of the graspable and movable regions of our environment. Since our measurement sources provide only uncertain measurements for these properties, we recursively estimate a belief for them and apply additional priors to resolve ambiguities. This aggregates a strong belief in regions with high measurements for their individual property. However, affordances need to be both graspable and movable. Therefore, we exchange information between our estimators, focusing each estimator's attention on relevant parts of the scene that satisfy both properties, i.e., affordances.
  • Figure 3: Our method can handle dynamic and complex scenes. While the non-recursive learning-based approaches detect new affordances directly, we rely on our injection step to introduce new affordances, such as the object inside the shelf, into our belief. Our estimates are more precise, focusing on manipulable affordances.
  • Figure 4: Our approach outperforms recent learning-based approaches in both standard, i.e., well-lit and clutter-free (top), and more challenging, i.e., dark or cluttered (bottom), settings. The monolithic learned models struggle with real-world ambiguities, while coupling resolves them by aligning estimator attention and filtering noise.
  • Figure 5: Our affordance estimates remain precise in dark environments. Hands-as-Probes and HRP suffer from visual ambiguities, and Where2Act struggles with object complexity, while our coupled system yields reliable affordances.
  • ...and 5 more figures
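
The coupling idea described in the abstract and in the caption of Figure 2 can be illustrated with a toy example. The following is a minimal sketch, not the authors' implementation: it assumes a 2D grid, synthetic Gaussian likelihood maps in place of real grasp and motion measurements, and a simple coupling rule in which each filter reweights its particles by the other filter's current belief. All function names and parameters (`blob`, `cells`, `belief`, `filter_step`, `run_demo`, `noise`, `coupling`, `eps`) are hypothetical, and the injection step mentioned in Figure 3 is not modeled.

```python
import numpy as np


def blob(shape, center, sigma=2.0):
    """Synthetic likelihood map: a Gaussian bump around `center` (assumed stand-in for a sensor)."""
    rows, cols = np.indices(shape)
    return np.exp(-((rows - center[0]) ** 2 + (cols - center[1]) ** 2) / (2 * sigma ** 2))


def cells(particles, shape):
    """Map continuous particle positions to integer grid indices, clamped to the grid."""
    idx = np.floor(particles).astype(int)
    return np.clip(idx[:, 0], 0, shape[0] - 1), np.clip(idx[:, 1], 0, shape[1] - 1)


def belief(particles, shape):
    """Histogram equally weighted particles into a normalized grid belief."""
    grid = np.zeros(shape)
    np.add.at(grid, cells(particles, shape), 1.0)
    return grid / grid.sum()


def filter_step(particles, likelihood, other_belief, rng, noise=0.5, coupling=0.5, eps=1e-6):
    """One predict / update / couple / resample cycle for a single estimator."""
    # Predict: diffuse particles with Gaussian noise.
    particles = particles + rng.normal(0.0, noise, particles.shape)
    rows, cols = cells(particles, likelihood.shape)
    # Update with this estimator's own measurement, then couple by
    # reweighting with the other estimator's belief (assumed coupling rule).
    weights = (likelihood[rows, cols] + eps) * (other_belief[rows, cols] + eps) ** coupling
    weights /= weights.sum()
    # Resample in proportion to the weights.
    keep = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[keep]


def run_demo(seed=0, size=20, n=2000, steps=10):
    rng = np.random.default_rng(seed)
    shape = (size, size)
    # Graspable evidence at (5, 5) and (15, 15); movable evidence at (15, 15) and (5, 15).
    grasp_lik = blob(shape, (5, 5)) + blob(shape, (15, 15))
    move_lik = blob(shape, (15, 15)) + blob(shape, (5, 15))
    grasp = rng.uniform(0, size, (n, 2))
    move = rng.uniform(0, size, (n, 2))
    for _ in range(steps):
        # Exchange beliefs computed before either filter is updated.
        grasp_belief, move_belief = belief(grasp, shape), belief(move, shape)
        grasp = filter_step(grasp, grasp_lik, move_belief, rng)
        move = filter_step(move, move_lik, grasp_belief, rng)
    # Affordance estimate: regions where both beliefs agree.
    affordance = belief(grasp, shape) * belief(move, shape)
    return affordance / affordance.sum()


affordance = run_demo()
print("peak cell:", np.unravel_index(affordance.argmax(), affordance.shape))
```

In this toy scene each likelihood map is ambiguous on its own (two bumps each), and only the location shared by both maps satisfies both properties. Because each filter is reweighted by the other's belief, particles at the unshared bumps are suppressed, and the combined estimate concentrates around the shared region.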