Cross-Embodied Affordance Transfer through Learning Affordance Equivalences
Hakan Aktas, Yukie Nagai, Minoru Asada, Matteo Saveriano, Erhan Oztop, Emre Ugur
TL;DR
This work addresses how to learn affordances that couple objects, actions, and effects across agents by formulating a shared affordance space and Affordance Equivalence. It introduces a multi-channel CNMP-based architecture that encodes object depth maps and time-series of actions and effects into latent vectors, blends them into a common representation $L^F$ via convex weights, and decodes to complete affordance components, enabling cross-embodiment transfer and direct imitation. The authors validate their approach through insertability, graspability, and rollability experiments, plus a real-robot imitation test, demonstrating object- and agent-level equivalences and transfer across diverse robots and objects. Results show the proposed method outperforms baselines in reconstruction and transfer tasks and can operate with partial input channels, highlighting practical potential for cross-robot skill transfer. Limitations include the need to retrain when adding new robots and assumptions about time-series consistency, with future work aimed at more diverse morphologies and ambiguity handling.
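As an illustration of the blending step described above, the following is a minimal sketch of a convex combination of per-channel latent vectors into a common representation (written $L^F$ in the text). It is not the authors' implementation: the function name `blend_latents`, the channel names, the latent dimensionality, and the choice to renormalize the weights over whichever channels are available are all assumptions made for illustration.

```python
import numpy as np


def blend_latents(latents, weights):
    """Blend per-channel latent vectors with convex weights.

    latents: dict mapping a channel name (hypothetical, e.g. "object",
             "action", "effect") to a 1-D latent vector; only the
             channels that are actually available need to be present.
    weights: dict mapping channel names to non-negative weights.

    Assumption for this sketch: weights are renormalized over the
    available channels so they always sum to 1 (a convex combination).
    """
    names = sorted(latents)
    w = np.array([weights.get(name, 0.0) for name in names], dtype=float)
    if np.any(w < 0) or w.sum() <= 0:
        raise ValueError("weights must be non-negative with a positive sum")
    w = w / w.sum()  # enforce the convexity constraint: sum of weights = 1
    stacked = np.stack([np.asarray(latents[name], dtype=float) for name in names])
    return w @ stacked  # weighted sum over channels


# All three channels available.
full = blend_latents(
    {"object": [1.0, 0.0], "action": [0.0, 1.0], "effect": [1.0, 1.0]},
    {"object": 0.5, "action": 0.25, "effect": 0.25},
)

# Partial input: the effect channel is missing, so the remaining
# weights are renormalized (an assumption of this sketch).
partial = blend_latents(
    {"object": [1.0, 0.0], "action": [0.0, 1.0]},
    {"object": 0.5, "action": 0.25, "effect": 0.25},
)
```

In a full model the blended vector would then be passed to the decoders that reconstruct the missing affordance components; that part is omitted here.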
Abstract
Affordances represent the inherent effect and action possibilities that objects offer to the agents within a given context. From a theoretical viewpoint, affordances bridge the gap between effect and action, providing a functional understanding of the connections between the actions of an agent and its environment in terms of the effects it can cause. In this study, we propose a deep neural network model that unifies objects, actions, and effects into a single latent vector in a common latent space that we call the affordance space. Using the affordance space, our system can generate effect trajectories when action and object are given and can generate action trajectories when effect trajectories and objects are given. Rather than learning the behavior of individual objects acted upon by a single agent, our model forms a "shared affordance representation" spanning multiple agents and objects, which we call Affordance Equivalence. Affordance Equivalence facilitates not only action generalization over objects but also cross-embodiment transfer linking actions of different robots. In addition to the simulation experiments that demonstrate the proposed model's range of capabilities, we also show that our model can be used for direct imitation in real-world settings.
