Exploiting Symmetry in Dynamics for Model-Based Reinforcement Learning with Asymmetric Rewards
Yasin Sonmez, Neelay Junnarkar, Murat Arcak
TL;DR
This work tackles sample efficiency in model-based reinforcement learning by enforcing symmetry in the learned dynamics through Cartan's method of moving frames, yielding a reduced, $G$-invariant representation of the dynamics. It defines a reduced function $\bar{F}$ on $\mathcal{X}^b \times \mathcal{U}$ and reconstructs the full dynamics via $F(x,u) = \phi_{\gamma(x)}^{-1}(\bar{F}(\rho(x), \psi_{\gamma(x)}(u)))$, so the symmetry holds by construction rather than having to be learned from data. Here $\gamma(x)$ is the moving frame, $\phi$ and $\psi$ are the group actions on states and inputs, and $\rho(x)$ maps the state to its reduced coordinates in $\mathcal{X}^b$. Empirical results on Parking and Reacher show that symmetry-aware dynamics learning can achieve lower observation error with smaller networks, particularly when parameter budgets are limited, indicating improved data efficiency. The approach broadens the applicability of symmetry techniques in reinforcement learning by allowing symmetry in the dynamics to be exploited independently of reward symmetry.
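The reconstruction formula $F(x,u) = \phi_{\gamma(x)}^{-1}(\bar{F}(\rho(x), \psi_{\gamma(x)}(u)))$ can be illustrated with a small sketch. Everything below is an assumption chosen for illustration and is not taken from the paper: a planar vehicle with state $x = (p_x, p_y, \theta, v)$, body-frame inputs $u$ = (acceleration, turn rate), the symmetry group $G = SE(2)$ acting by rotating and translating the pose, a trivial input action $\psi_g(u) = u$, and a hand-written `F_bar` standing in for a learned network. The symmetry property assumed here is $F(\phi_g(x), \psi_g(u)) = \phi_g(F(x,u))$.

```python
import math

def rotate(angle, px, py):
    """Rotate the planar vector (px, py) by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return c * px - s * py, s * px + c * py

def phi(g, x):
    """State action of g = (alpha, tx, ty) in SE(2): rotate, then translate the pose."""
    alpha, tx, ty = g
    px, py, theta, v = x
    qx, qy = rotate(alpha, px, py)
    return (qx + tx, qy + ty, theta + alpha, v)

def phi_inv(g, x):
    """Inverse of the state action: undo the translation, then the rotation."""
    alpha, tx, ty = g
    px, py, theta, v = x
    qx, qy = rotate(-alpha, px - tx, py - ty)
    return (qx, qy, theta - alpha, v)

def psi(g, u):
    """Input action. Body-frame inputs are unchanged by a pose change (assumption)."""
    return u

def gamma(x):
    """Moving frame: the group element that sends x to the cross-section
    p = (0, 0), theta = 0."""
    px, py, theta, _ = x
    qx, qy = rotate(-theta, px, py)
    return (-theta, -qx, -qy)

def rho(x):
    """Reduced state: what is left after normalizing by gamma(x), here the speed v."""
    return phi(gamma(x), x)[3]

def F_bar(v, u):
    """Hypothetical stand-in for the learned reduced model. It predicts the next
    state in the normalized frame. Any function could be used here."""
    accel, turn = u
    return (0.1 * v + 0.02 * math.tanh(accel),  # forward displacement
            0.01 * v * turn,                    # lateral displacement
            0.1 * turn,                         # heading change
            v + 0.1 * accel)                    # next speed

def F(x, u):
    """Full dynamics reconstructed from the reduced model."""
    g = gamma(x)
    return phi_inv(g, F_bar(rho(x), psi(g, u)))

if __name__ == "__main__":
    x, u, g = (1.0, -2.0, 0.7, 3.0), (0.5, -0.1), (1.3, 4.0, -1.0)
    print("F(phi_g(x), psi_g(u)) =", F(phi(g, x), psi(g, u)))
    print("phi_g(F(x, u))        =", phi(g, F(x, u)))
```

In this sketch the two printed states agree up to floating-point error for any choice of `F_bar`, because the symmetry is carried by `gamma`, `phi`, and `psi` rather than by the learned part. Only the reduced input (the speed and the two controls) reaches `F_bar`, which is one way to see why a smaller network may suffice.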
Abstract
Recent work in reinforcement learning has leveraged symmetries in the model to improve sample efficiency in training a policy. A commonly used simplifying assumption is that the dynamics and reward both exhibit the same symmetry; however, in many real-world environments, the dynamical model exhibits symmetry independent of the reward model. In this paper, we assume only the dynamics exhibit symmetry, extending the scope of problems in reinforcement learning and learning in control theory to which symmetry techniques can be applied. We use Cartan's moving frame method to introduce a technique for learning dynamics that, by construction, exhibit specified symmetries. Numerical experiments demonstrate that the proposed method learns a more accurate dynamical model
