Subequivariant Reinforcement Learning Framework for Coordinated Motion Control
Haoyu Wang, Xiaoyu Tan, Xihe Qiu, Chao Qu
TL;DR
This work tackles the problem of coordinating multi-joint motion in reinforcement learning, where traditional graph networks struggle to capture inter-joint physics and symmetry under external influences such as gravity. The authors introduce CoordiGraph, a gravity-aware subequivariant network that decomposes the joint graph into subgraphs, propagates equivariant information via external fields, and employs object-aware message passing to model shape-dependent interactions. The policy is trained with proximal policy optimization (PPO) on a translation-invariant feature representation, which preserves symmetry while maintaining expressive power. Empirical results on MuJoCo locomotion tasks show that CoordiGraph achieves better generalization, faster learning, and improved coordination than standard graph networks, and ablations highlight the contribution of the subequivariant components. Overall, the paper demonstrates that embedding subequivariant priors into graph-based RL yields practical benefits for complex, high-dimensional motion control tasks.
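To make "gravity-aware subequivariant" and "translation-invariant" concrete, here is a minimal NumPy sketch. It is not the CoordiGraph layer. It uses a generic Gram-matrix construction common in the subequivariant-network literature, and the names `rotation_about_z`, `center`, `subequivariant_map`, and `weights` are hypothetical. It also assumes gravity points along the z axis. The map commutes with rotations about the gravity axis and ignores global translations of the joints.

```python
import numpy as np

def rotation_about_z(theta):
    """Rotation matrix about the z axis (assumed here to be the gravity axis)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def center(positions):
    """Subtract the centroid so the features ignore global translations."""
    return positions - positions.mean(axis=0, keepdims=True)

def subequivariant_map(positions, gravity, weights):
    """Illustrative subequivariant map (hypothetical, not the paper's layer).

    positions: (n, 3) joint positions
    gravity:   (3,)   gravity vector, treated as an extra input channel
    weights:   (n + 1, k) mixing weights (random here; learned in practice)
    returns:   (k, 3) output vectors
    """
    # Columns of Z are the centered joint vectors followed by gravity: (3, n + 1).
    Z = np.concatenate([center(positions), gravity[None, :]], axis=0).T
    # The Gram matrix of inner products is unchanged by any rotation R with R g = g.
    gram = Z.T @ Z
    # Invariant mixing coefficients computed from the Gram matrix: (n + 1, k).
    mix = np.tanh(gram @ weights)
    # Linear combinations of the input vectors rotate together with the inputs.
    return (Z @ mix).T

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    positions = rng.normal(size=(5, 3))
    gravity = np.array([0.0, 0.0, -9.81])
    weights = rng.normal(size=(6, 4))

    R = rotation_about_z(0.7)
    shift = np.array([1.0, -2.0, 3.0])

    out = subequivariant_map(positions, gravity, weights)
    # Rotate the joints about the gravity axis and translate them.
    out_moved = subequivariant_map(positions @ R.T + shift, gravity, weights)
    print(np.allclose(out_moved, out @ R.T))
```

The check passes only for rotations that leave the gravity vector fixed. A tilt about the x axis changes the inner products between joints and gravity, so the equivariance relation generally fails. That restriction to a subgroup of rotations is what "subequivariant" refers to.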
Abstract
Effective coordination is crucial for motion control with reinforcement learning, especially as the complexity of agents and their motions increases. However, many existing methods struggle to account for the intricate dependencies between joints. We introduce CoordiGraph, a novel architecture that leverages subequivariant principles from physics to enhance coordination in motion control with reinforcement learning. The method embeds equivariance as an inherent pattern in the learning process under the influence of gravity, which helps model the nuanced relationships between joints that are vital for motion control. Through extensive experiments with sophisticated agents in diverse environments, we highlight the merits of our approach. Compared with current leading methods, CoordiGraph notably improves generalization and sample efficiency.
