Approximate Equivariance in Reinforcement Learning
Jung Yeon Park, Sujay Bhatt, Sihan Zeng, Lawson L. S. Wong, Alec Koppel, Sumitra Ganesh, Robin Walters
TL;DR
This work introduces a formal framework for approximately equivariant reinforcement learning. It defines $(G,\epsilon_R,\epsilon_P)$-invariant MDPs and proves that the optimal $Q$-function $Q^{*}$ of such an MDP is approximately invariant under the symmetry transformations. It also develops a practical architecture based on relaxed group and steerable convolutions for learning policies and value functions that are robust to symmetry breaking. Empirically, approximately equivariant RL matches exactly equivariant baselines when the symmetry is exact and often outperforms them when it is imperfect, across continuous control tasks and a stock trading task with real financial data. The approach improves sample efficiency and robustness to noise, and it can adapt to symmetry-breaking factors, making it a flexible inductive bias for RL in realistic environments. Code to reproduce the results is publicly available.
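To make "approximately invariant" concrete, the following minimal sketch measures how far a function $Q(s,a)$ is from being invariant under a group action, i.e. the largest value of $|Q(s,a) - Q(gs,ga)|$ over a grid. Everything in it is an illustrative assumption rather than the authors' code: the two-element reflection group acting by negation, the toy functions `q_symmetric` and `q_perturbed` (which are not the optimal $Q$-function of any particular MDP), and the helper name `invariance_gap`.

```python
import numpy as np

def invariance_gap(q, group_actions, states, actions):
    """Largest |q(s, a) - q(g.s, g.a)| over the given group elements and grid."""
    gap = 0.0
    for act_s, act_a in group_actions:
        for s in states:
            for a in actions:
                gap = max(gap, abs(q(s, a) - q(act_s(s), act_a(a))))
    return gap

# Hypothetical symmetry group: identity and reflection, acting on 1-D states and actions.
C2 = [
    (lambda s: s, lambda a: a),     # identity
    (lambda s: -s, lambda a: -a),   # reflection
]

def q_symmetric(s, a):
    # Toy function that is exactly invariant under (s, a) -> (-s, -a).
    return -(s - a) ** 2

def q_perturbed(s, a, eps=0.05):
    # Same function plus a symmetry-breaking term of size eps * s.
    return q_symmetric(s, a) + eps * s

grid = np.linspace(-1.0, 1.0, 21)
print("gap, exact symmetry:  ", invariance_gap(q_symmetric, C2, grid, grid))
print("gap, broken symmetry: ", invariance_gap(q_perturbed, C2, grid, grid))
```

For the perturbed function the gap is $2\epsilon \max|s|$, so it scales with the size of the symmetry-breaking term. This mirrors, in a toy setting only, the kind of bound the framework gives for $Q^{*}$.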
Abstract
Equivariant neural networks have shown great success in reinforcement learning, improving sample efficiency and generalization when there is symmetry in the task. However, in many problems, only approximate symmetry is present, which makes imposing exact symmetry inappropriate. Recently, approximately equivariant networks have been proposed for supervised classification and modeling physical systems. In this work, we develop approximately equivariant algorithms in reinforcement learning (RL). We define approximately equivariant MDPs and theoretically characterize the effect of approximate equivariance on the optimal $Q$-function. We propose novel RL architectures using relaxed group and steerable convolutions and experiment on several continuous control domains and stock trading with real financial data. Our results demonstrate that the approximately equivariant network performs on par with exactly equivariant networks when exact symmetries are present, and outperforms them when the domains exhibit approximate symmetry. As a byproduct of these techniques, we observe increased robustness to noise at test time. Our code is available at https://github.com/jypark0/approx_equiv_rl.
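As a rough illustration of what a relaxed group convolution does, the sketch below implements one on the cyclic group $C_n$ with a single scalar channel. It assumes the form used in prior work on approximately equivariant networks, $\sum_{h} f(h) \sum_{l} w_l(h)\,\psi_l(g^{-1}h)$, where the kernel is a combination of $L$ filters with weights $w_l(h)$ that may depend on the group element. The function names, shapes, and test values are hypothetical, and this is not the implementation in the authors' repository.

```python
import numpy as np

def relaxed_group_conv(f, psi, w):
    """Relaxed group convolution on the cyclic group C_n (single channel).

    f:   (n,)   signal on the group
    psi: (L, n) bank of L filters on the group
    w:   (L, n) relaxation weights w_l(h); constant in h gives an ordinary group conv
    """
    n = f.shape[0]
    out = np.zeros(n)
    for g in range(n):
        for h in range(n):
            # In C_n, g^{-1} h corresponds to (h - g) mod n.
            kernel_gh = np.sum(w[:, h] * psi[:, (h - g) % n])
            out[g] += f[h] * kernel_gh
    return out

def shift(f, k):
    """Action of k in C_n on a signal: (k.f)(h) = f((h - k) mod n)."""
    return np.roll(f, k)

def equivariance_error(f, psi, w, k=1):
    """Max difference between 'act then convolve' and 'convolve then act'."""
    lhs = relaxed_group_conv(shift(f, k), psi, w)
    rhs = shift(relaxed_group_conv(f, psi, w), k)
    return np.max(np.abs(lhs - rhs))

rng = np.random.default_rng(0)
n, L = 8, 3
f = rng.normal(size=n)
psi = rng.normal(size=(L, n))

w_exact = np.full((L, n), 1.0 / L)                     # weights independent of h
w_relaxed = w_exact + 0.1 * rng.normal(size=(L, n))    # weights vary with h

print("equivariance error, constant weights:", equivariance_error(f, psi, w_exact))
print("equivariance error, relaxed weights: ", equivariance_error(f, psi, w_relaxed))
```

With weights that do not depend on $h$, the layer reduces to an ordinary group convolution and the error is zero up to floating-point rounding. Letting the weights vary across group elements is what allows the layer to depart from exact equivariance where the data calls for it.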
