Leveraging Symmetry to Accelerate Learning of Trajectory Tracking Controllers for Free-Flying Robotic Systems

Jake Welde; Nishanth Rao; Pratik Kunapuli; Dinesh Jayaraman; Vijay Kumar

TL;DR

This work addresses the data inefficiency of reinforcement learning for tracking controllers on free-flying robots by exploiting continuous Lie group symmetries. It formulates trajectory tracking as a stationary continuous MDP and proves that symmetry induces an MDP homomorphism onto a lower-dimensional quotient MDP, so that a policy learned on the quotient can be lifted to the original system with optimality and Q-values preserved. The authors derive explicit quotient constructions for the Particle, Astrobee, and Quadrotor systems and show that symmetry-aware learning accelerates training and lowers tracking error, including zero-shot generalization to planned trajectories. The framework offers a principled way to reduce problem dimensionality without sacrificing performance, making RL more efficient for complex robotic systems.

Abstract

Tracking controllers enable robotic systems to accurately follow planned reference trajectories. In particular, reinforcement learning (RL) has shown promise in the synthesis of controllers for systems with complex dynamics and modest online compute budgets. However, the poor sample efficiency of RL and the challenges of reward design make training slow and sometimes unstable, especially for high-dimensional systems. In this work, we leverage the inherent Lie group symmetries of robotic systems with a floating base to mitigate these challenges when learning tracking controllers. We model a general tracking problem as a Markov decision process (MDP) that captures the evolution of both the physical and reference states. Next, we prove that symmetry in the underlying dynamics and running costs leads to an MDP homomorphism, a mapping that allows a policy trained on a lower-dimensional "quotient" MDP to be lifted to an optimal tracking controller for the original system. We compare this symmetry-informed approach to an unstructured baseline, using Proximal Policy Optimization (PPO) to learn tracking controllers for three systems: the Particle (a forced point mass), the Astrobee (a fully actuated space robot), and the Quadrotor (an underactuated system). Results show that a symmetry-aware approach both accelerates training and reduces tracking error at convergence.

Paper Structure (12 sections, 4 theorems, 45 equations, 1 figure, 1 table)

Introduction
Background and Preliminaries
Homomorphisms of Markov Decision Processes
Lie Group Symmetries of Markov Decision Processes
Tracking Control Problems with Lie Group Symmetries
Modeling a Tracking Control Problem as an MDP
Symmetries of Tracking Control MDPs
Continuous MDP Homomorphisms Induced by Lie Group Symmetries
Quotient MDPs for Tracking Control in Free-Flying Robotic Systems
Experiments
Discussion
Conclusion

Key Result

Theorem 1

Suppose $(p,h)$ is an MDP homomorphism from $\mathcal{M}$ to $\widetilde{\mathcal{M}}$ and $\pi$ is a lift of any policy $\widetilde{\pi}$ for $\widetilde{\mathcal{M}}$. Then, $Q^\pi(s,a) = \widetilde{Q}^{\widetilde{\pi}}(p(s),h(s,a))$. Moreover, if $\widetilde{\pi}$ is optimal for $\widetilde{\mathcal{M}}$ …
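
To make the lifting statement concrete, here is a minimal Python sketch for a one-dimensional point mass under translation symmetry. It is an illustration under stated assumptions, not the paper's construction: the state layout `(x, v, x_ref, v_ref)`, the constant-velocity reference, the quadratic cost, the step size `DT`, and the hand-picked feedback gains are all hypothetical. The sketch assumes the quotient map `p` keeps only the position error and the velocities, and that `h` leaves the force unchanged because a translation does not act on it.

```python
from typing import Callable, Tuple

State = Tuple[float, float, float, float]  # (x, v, x_ref, v_ref): hypothetical layout
QState = Tuple[float, float, float]        # (x - x_ref, v, v_ref): assumed quotient state

DT = 0.05  # hypothetical integration step


def step(s: State, a: float) -> State:
    """Euler step of a unit point mass with force a; reference moves at constant velocity."""
    x, v, x_ref, v_ref = s
    return (x + DT * v, v + DT * a, x_ref + DT * v_ref, v_ref)


def cost(s: State, a: float) -> float:
    """Running cost that depends on positions only through the error x - x_ref."""
    x, v, x_ref, v_ref = s
    return (x - x_ref) ** 2 + (v - v_ref) ** 2 + 0.01 * a * a


def translate(s: State, g: float) -> State:
    """Symmetry action: shift the physical and reference positions by the same offset g."""
    x, v, x_ref, v_ref = s
    return (x + g, v, x_ref + g, v_ref)


def p(s: State) -> QState:
    """Assumed quotient map: invariant under translate()."""
    x, v, x_ref, v_ref = s
    return (x - x_ref, v, v_ref)


def h(s: State, a: float) -> float:
    """Assumed action map: translations do not change the applied force."""
    return a


def lift(q_policy: Callable[[QState], float]) -> Callable[[State], float]:
    """Lift a quotient policy: pick the action a with h(s, a) = q_policy(p(s))."""
    return lambda s: q_policy(p(s))  # h is the identity in a, so no inversion is needed


def quotient_policy(q: QState) -> float:
    """Hand-picked PD feedback on the quotient state (stands in for a learned policy)."""
    e, v, v_ref = q
    return -4.0 * e - 3.0 * (v - v_ref)


def rollout_cost(policy: Callable[[State], float], s: State, steps: int = 200) -> float:
    """Accumulated running cost of a policy over a fixed horizon."""
    total = 0.0
    for _ in range(steps):
        a = policy(s)
        total += cost(s, a)
        s = step(s, a)
    return total


if __name__ == "__main__":
    pi = lift(quotient_policy)
    s0 = (1.0, 0.0, 0.0, 0.5)
    print(rollout_cost(pi, s0), rollout_cost(pi, translate(s0, 7.25)))
```

The two printed costs should agree up to floating-point rounding, since the lifted policy sees the state only through `p(s)` and both the dynamics and the cost are translation-invariant in this toy setup. This mirrors, in a finite-horizon and deterministic setting, the claim that the value of a lifted policy is determined by the quotient state.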

Figures (1)

Figure 1: Mean reward during training (for 10 training seeds) and, for the best-performing seed, mean tracking error during evaluation (for 20 trajectories), with translational errors as solid lines and rotational errors as dashed lines.

Theorems & Definitions (21)

Definition 1: see Panangaden2024
Definition 2: see Panangaden2024
Theorem 1: see Panangaden2024
Definition 3
Remark 1
Definition 4
Example 1 (Particle)
Definition 5
Remark 2
Example 2 (Particle, continued)
...and 11 more
