Information-driven Affordance Discovery for Efficient Robotic Manipulation

Pietro Mazzaglia; Taco Cohen; Daniel Dijkman

TL;DR

IDA is an information-driven framework for visual affordance discovery that casts the problem as a contextual bandit and uses an information gain based on the Jensen-Shannon divergence to guide exploration. An ensemble of decoders with a shared encoder yields per-pixel affordance maps from a 2D point-cloud input, enabling data-efficient learning of multiple manipulation primitives such as grasping, stacking, and opening. Across ManiSkill2 simulations and real-world grasping with a UFACTORY xArm 6, IDA shows improved data efficiency and robustness compared with other approaches, with ablations highlighting the benefits of information-driven sampling and of the reward term. The work advances practical robotic manipulation by reducing the number of interactions needed to learn useful affordances, and it suggests avenues for extending the approach to longer-horizon tasks and hierarchical control.
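The sampling idea summarized above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: it assumes each ensemble member outputs a per-pixel Bernoulli success probability, and that the expected reward and the information gain are combined additively with a hypothetical `reward_weight`. The function names and the toy `demo` array are invented for the example.

```python
import numpy as np


def binary_entropy(p, eps=1e-8):
    """Entropy (in nats) of a Bernoulli distribution with success probability p."""
    p = np.clip(p, eps, 1.0 - eps)
    return -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))


def affordance_and_info_gain(ensemble_probs):
    """Return the mean affordance map and a Jensen-Shannon information gain map.

    ensemble_probs: array of shape (K, H, W) holding the per-pixel success
    probabilities predicted by K decoders (assumed Bernoulli outcomes).
    """
    mean_probs = ensemble_probs.mean(axis=0)
    # Generalized Jensen-Shannon divergence with uniform weights:
    # entropy of the mean prediction minus the mean entropy of the members.
    info_gain = binary_entropy(mean_probs) - binary_entropy(ensemble_probs).mean(axis=0)
    return mean_probs, info_gain


def select_pixel(ensemble_probs, reward_weight=1.0):
    """Pick the pixel maximizing expected reward plus information gain.

    The additive combination and reward_weight are assumptions of this sketch;
    reward_weight=0.0 corresponds to purely information-driven exploration.
    """
    mean_probs, info_gain = affordance_and_info_gain(ensemble_probs)
    score = reward_weight * mean_probs + info_gain
    row, col = np.unravel_index(int(np.argmax(score)), score.shape)
    return int(row), int(col)


# Toy ensemble of K=3 decoders on a 2x2 image. The members agree on every
# pixel except (1, 1), where their predictions diverge.
demo = np.array([
    [[0.5, 0.9], [0.1, 0.01]],
    [[0.5, 0.9], [0.1, 0.99]],
    [[0.5, 0.9], [0.1, 0.50]],
])

mean_probs, info_gain = affordance_and_info_gain(demo)
chosen = select_pixel(demo)
```

In the toy example, the information gain is zero wherever the decoders agree and positive only at the pixel where they disagree, so exploration is drawn toward the location the ensemble is least certain about. Raising `reward_weight` shifts the choice toward pixels the ensemble already believes will succeed.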

Abstract

Robotic affordances, which provide information about what actions can be taken in a given situation, can aid robotic manipulation. However, learning about affordances requires expensive, large annotated datasets of interactions or demonstrations. In this work, we argue that well-directed interactions with the environment can mitigate this problem, and we propose an information-based measure to augment the agent's objective and accelerate the affordance discovery process. We provide a theoretical justification of our approach and empirically validate it in both simulation and real-world tasks. Our method, which we dub IDA, enables the efficient discovery of visual affordances for several action primitives, such as grasping, stacking objects, or opening drawers, strongly improving data efficiency in simulation, and it allows us to learn grasping affordances in a small number of interactions on a real-world setup with a UFACTORY xArm 6 robot arm.

Paper Structure (11 sections, 13 equations, 6 figures, 2 tables)

Introduction
Related Work
Information-driven Affordance Discovery
Implementation
Experiments
Simulation experiments
Real-world experiments
Conclusion
Information radius derivation
Training details
Evaluation details

Figures (6)

Figure 1: Information-driven affordance discovery. The model processes inputs from the environment (2D point cloud) using a single encoder, concatenates action parameters, and decodes visual affordance maps with multiple decoders (ensemble). Averaging the outputs of these networks yields reliable affordance maps, thanks to the ensemble diversity. Computing the information radius yields information gain maps about affordances in the scene, which drive well-directed exploratory interactions. Images represent actual model outputs from IDA in the ManiSkill2 Open Drawer environment.
Figure 2: ManiSkill2 performance. Affordance success aggregated across ManiSkill2 tasks and runs.
Figure 3: Performance over time. The affordance success rate in the evaluation stage increases with the number of interactions, averaged over all tasks (5+ seeds per task).
Figure 4: Reward-free ablation. Comparison of reward-free affordance discovery methods over time (5+ seeds).
Figure 5: Real-world results and setup. IDA learns to grasp objects faster than other approaches, achieving up to 90% grasping success, on a UFACTORY xArm 6 platform.
...and 1 more figure
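
The information radius named in the Figure 1 caption is commonly defined as the generalized Jensen-Shannon divergence among ensemble members. The following is a sketch under stated assumptions rather than the paper's own derivation: $K$ decoders with uniform weights, each predicting a Bernoulli success probability $p_k(x, a)$ for pixel $x$ and action parameters $a$. The symbols $p_k$, $\bar{p}$, and $\mathrm{IG}$ are notation introduced here.

```latex
\begin{align}
\bar{p}(x, a) &= \frac{1}{K} \sum_{k=1}^{K} p_k(x, a), \\
\mathrm{IG}(x, a) &= H\big(\bar{p}(x, a)\big) - \frac{1}{K} \sum_{k=1}^{K} H\big(p_k(x, a)\big)
  = \frac{1}{K} \sum_{k=1}^{K} D_{\mathrm{KL}}\big(p_k(x, a) \,\|\, \bar{p}(x, a)\big), \\
H(p) &= -p \log p - (1 - p) \log (1 - p).
\end{align}
```

Under these assumptions, $\mathrm{IG}(x, a) \geq 0$, with equality exactly when all decoders agree, and $\mathrm{IG}(x, a) \leq \min(\log K, \log 2)$, so the map is bounded and highlights only the regions where the ensemble members disagree.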
