Information-driven Affordance Discovery for Efficient Robotic Manipulation
Pietro Mazzaglia, Taco Cohen, Daniel Dijkman
TL;DR
This work addresses the data inefficiency of learning visual affordances for robotic manipulation by reframing affordance discovery as a contextual bandit problem and introducing Information-Driven Affordance Discovery (IDA). IDA uses an ensemble of decoders with a shared encoder to output per-pixel affordance probabilities. It guides exploration with an information gain term $I(x,a)$, computed as the Jensen–Shannon divergence across the ensemble and combined with the reward via a UCB-like strategy. The approach yields higher data efficiency and robust final performance in ManiSkill2 simulation, and it enables fast real-world grasp learning on a UFACTORY XArm 6 with no prior data. The results suggest that information-guided exploration can accelerate the learning of actionable visual affordances while reducing reliance on large annotated or synthetic datasets.
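The scoring rule summarized in this TL;DR can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes each ensemble member outputs a per-pixel Bernoulli success probability, that the Jensen–Shannon divergence is taken with equal weights over members, and that reward and information gain are combined additively with a hypothetical weight `beta`. The function names are likewise illustrative.

```python
import numpy as np


def bernoulli_entropy(p, eps=1e-8):
    """Entropy (in nats) of Bernoulli distributions with success probability p."""
    p = np.clip(p, eps, 1.0 - eps)
    return -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))


def jsd_information_gain(probs):
    """Equal-weight Jensen-Shannon divergence across ensemble members.

    probs: array of shape (K, H, W) holding the success probability that
    each of the K ensemble members assigns to every pixel (action).
    Returns an (H, W) map: entropy of the mean prediction minus the mean
    of the per-member entropies.
    """
    mean_p = probs.mean(axis=0)
    gain = bernoulli_entropy(mean_p) - bernoulli_entropy(probs).mean(axis=0)
    # Clamp tiny negative values that clipping can introduce near 0 or 1.
    return np.maximum(gain, 0.0)


def select_action(probs, beta=1.0):
    """Pick the pixel maximizing expected reward plus a weighted bonus.

    The additive form and the weight `beta` are assumptions made for
    illustration of a UCB-like rule.
    """
    score = probs.mean(axis=0) + beta * jsd_information_gain(probs)
    idx = np.unravel_index(np.argmax(score), score.shape)
    return tuple(int(i) for i in idx), score


# Two ensemble members that disagree only at pixel (0, 1).
probs = np.array([
    [[0.9, 0.9], [0.1, 0.5]],
    [[0.9, 0.1], [0.1, 0.5]],
])
greedy_pixel, _ = select_action(probs, beta=0.0)
exploratory_pixel, _ = select_action(probs, beta=2.0)
```

In this toy example, setting `beta=0.0` selects the pixel with the highest mean predicted success, while a sufficiently large `beta` shifts the choice to the pixel where the ensemble members disagree, which is the behavior the information gain term is meant to encourage.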
Abstract
Robotic affordances, which provide information about what actions can be taken in a given situation, can aid robotic manipulation. However, learning affordances requires large, expensive annotated datasets of interactions or demonstrations. In this work, we argue that well-directed interactions with the environment can mitigate this problem, and we propose an information-based measure that augments the agent's objective and accelerates the affordance discovery process. We provide a theoretical justification of our approach and empirically validate it in both simulation and real-world tasks. Our method, which we dub IDA, enables the efficient discovery of visual affordances for several action primitives, such as grasping, stacking objects, or opening drawers, and strongly improves data efficiency in simulation. It also allows us to learn grasping affordances in a small number of interactions on a real-world setup with a UFACTORY XArm 6 robot arm.
