Geometry-aware RL for Manipulation of Varying Shapes and Deformable Objects
Tai Hoang, Huy Le, Philipp Becker, Vien Anh Ngo, Gerhard Neumann
TL;DR
This work addresses the challenge of manipulating objects with varying geometries and deformable materials by framing robotic tasks as heterogeneous graphs with distinct actuator and object nodes. It introduces the Heterogeneous Equivariant Policy (HEPi), an SE(3)-equivariant graph-based policy that explicitly models heterogeneity and uses an efficient equivariant MPNN backbone (PONITA-based) to enable robust 3D manipulation. A principled trust-region training approach (TRPL) stabilizes on-policy learning. On a new seven-task benchmark in NVIDIA IsaacLab, HEPi improves performance, sample efficiency, and generalization over Transformer and non-heterogeneous baselines, especially in complex 3D scenarios such as Cloth-Hanging and multi-agent insertion. The work advances geometric RL in robotics by combining explicit heterogeneity with SE(3) symmetry, with practical impact for dexterous manipulation of rigid and deformable objects in 3D space, and it outlines avenues for incorporating full robot morphology and vision-based perception.
Abstract
Manipulating objects with varying geometries and deformable objects is a major challenge in robotics. Tasks such as insertion with different objects or cloth hanging require precise control and effective modelling of complex dynamics. In this work, we frame this problem through the lens of a heterogeneous graph that comprises smaller sub-graphs, such as actuators and objects, accompanied by different edge types describing their interactions. This graph representation serves as a unified structure for both rigid and deformable object tasks, and can be extended further to tasks comprising multiple actuators. To evaluate this setup, we present a novel and challenging reinforcement learning benchmark, including rigid insertion of diverse objects, as well as rope and cloth manipulation with multiple end-effectors. These tasks present a large search space, as both the initial and target configurations are uniformly sampled in 3D space. To address this issue, we propose a novel graph-based policy model, dubbed Heterogeneous Equivariant Policy (HEPi), which uses $SE(3)$ equivariant message passing networks as its main backbone to exploit the geometric symmetry. In addition, by explicitly modeling heterogeneity, HEPi outperforms Transformer-based and non-heterogeneous equivariant policies in terms of average returns, sample efficiency, and generalization to unseen objects. Our project page is available at https://thobotics.github.io/hepi.
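To make the two central ideas concrete, the following is a minimal sketch, not the HEPi architecture or its PONITA backbone. It builds a toy heterogeneous graph with object and actuator nodes and two edge types, then runs one hand-written message-passing round whose per-actuator output vectors rotate with the scene and ignore translations. All names here (`build_hetero_graph`, `actuator_actions`, the `obj_obj` and `obj_act` edge types, the `radius` rule, the exponential weights) are assumptions chosen for illustration and do not come from the paper.

```python
import math


def sub(a, b):
    return [a[i] - b[i] for i in range(3)]


def norm(v):
    return math.sqrt(sum(c * c for c in v))


def build_hetero_graph(actuator_pos, object_pos, radius):
    """Typed edge lists (sender, receiver) for a toy heterogeneous graph."""
    n_obj, n_act = len(object_pos), len(actuator_pos)
    # Object-object edges: link distinct object nodes closer than `radius`.
    obj_obj = [(i, j) for i in range(n_obj) for j in range(n_obj)
               if i != j and norm(sub(object_pos[i], object_pos[j])) < radius]
    # Object-to-actuator edges: every object node sends to every actuator node.
    obj_act = [(o, a) for o in range(n_obj) for a in range(n_act)]
    return {"obj_obj": obj_obj, "obj_act": obj_act}


def actuator_actions(actuator_pos, object_pos, edges):
    """One toy message-passing round; returns a 3D vector per actuator."""
    # Step 1 (obj_obj edges): a scalar feature per object node that depends
    # only on distances, so it is unchanged by rotations and translations.
    h = [0.0] * len(object_pos)
    for s, r in edges["obj_obj"]:
        h[r] += math.exp(-norm(sub(object_pos[s], object_pos[r])))
    # Step 2 (obj_act edges): relative vectors scaled by invariant weights.
    # Relative vectors rotate with the scene, which makes the sum equivariant.
    actions = [[0.0, 0.0, 0.0] for _ in actuator_pos]
    for o, a in edges["obj_act"]:
        rel = sub(object_pos[o], actuator_pos[a])
        w = math.exp(-norm(rel)) * (1.0 + h[o])
        for k in range(3):
            actions[a][k] += w * rel[k]
    return actions


def transform(points, rot, shift):
    """Apply x -> rot @ x + shift to each 3D point."""
    return [[sum(rot[i][k] * p[k] for k in range(3)) + shift[i]
             for i in range(3)] for p in points]


if __name__ == "__main__":
    c, s = math.cos(0.7), math.sin(0.7)
    rot = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]  # rotation about z
    act = [[0.0, 0.0, 0.0], [1.0, 0.5, -0.2]]
    obj = [[0.3, 0.1, 0.2], [0.6, -0.4, 0.9], [1.2, 0.8, 0.1]]
    base = actuator_actions(act, obj, build_hetero_graph(act, obj, 1.0))
    act2 = transform(act, rot, [0.5, -1.0, 2.0])
    obj2 = transform(obj, rot, [0.5, -1.0, 2.0])
    moved = actuator_actions(act2, obj2, build_hetero_graph(act2, obj2, 1.0))
    rotated = transform(base, rot, [0.0, 0.0, 0.0])
    err = max(abs(moved[a][k] - rotated[a][k])
              for a in range(2) for k in range(3))
    print("max equivariance error:", err)
```

The demo at the bottom transforms the whole scene by a rotation and a translation and compares the resulting actions with the rotated original actions. The two agree up to floating-point error, which is the property an equivariant policy exploits: it does not have to relearn the same behaviour for every pose sampled in 3D space.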
