Addressing imperfect symmetry: A novel symmetry-learning actor-critic extension

Miguel Abreu; Luis Paulo Reis; Nuno Lau

TL;DR

The paper addresses imperfect symmetry in reinforcement learning by introducing Adaptive Symmetry Learning (ASL), a model-minimization actor-critic extension that learns to adapt symmetry mappings during training. ASL combines a dedicated symmetry-fitting component with a modular loss that enforces a common symmetric relation across states while excluding neutral regions and avoiding disadvantageous updates; it extends existing symmetry losses (MSL, PSL) with value losses and non-involutory handling. Through a four-legged ant locomotion case study, ASL demonstrates strong recovery from large perturbations, effective generalization to hidden symmetric states, and competitive performance relative to prior symmetry methods, especially in linear transformation scenarios. The work highlights ASL’s practical potential for leveraging symmetry under realistic perturbations and suggests directions for automated hyperparameter tuning and broader algorithmic integration. The overall impact lies in providing a flexible, adaptive framework to exploit symmetry without assuming perfect prior knowledge, improving data efficiency and robustness in complex robotic RL tasks.

Abstract

Symmetry, a fundamental concept to understand our environment, often oversimplifies reality from a mathematical perspective. Humans are a prime example, deviating from perfect symmetry in terms of appearance and cognitive biases (e.g. having a dominant hand). Nevertheless, our brain can easily overcome these imperfections and efficiently adapt to symmetrical tasks. The driving motivation behind this work lies in capturing this ability through reinforcement learning. To this end, we introduce Adaptive Symmetry Learning (ASL), a model-minimization actor-critic extension that addresses incomplete or inexact symmetry descriptions by adapting itself during the learning process. ASL consists of a symmetry fitting component and a modular loss function that enforces a common symmetric relation across all states while adapting to the learned policy. The performance of ASL is compared to existing symmetry-enhanced methods in a case study involving a four-legged ant model for multidirectional locomotion tasks. The results show that ASL can recover from large perturbations and generalize knowledge to hidden symmetric states. It achieves comparable or better performance than alternative methods in most scenarios, making it a valuable approach for leveraging model symmetry while compensating for inherent perturbations.

Paper Structure (54 sections, 37 equations, 25 figures, 31 tables)

Introduction
Symmetry Perturbations
Model Minimization
Relabeling states and actions
Data augmentation
Symmetric Networks
Symmetry Loss Function
Preliminaries
MDP transformations
Policy notation
Symmetric policy
Neutral states
Related work
Relabeling states and actions
Data augmentation
...and 39 more sections

Figures (25)

Figure 1: Plots for PPO's surrogate objective function $L^{C}$ as a function of ratio $r$, for a single time step, for a positive advantage estimate (on the left) or a negative advantage estimate (on the right).
Figure 2: Adaptive Symmetry Learning algorithm overview
Figure 3: Overview of the symmetry fitting process
Figure 4: 2D robot shaped as an equilateral triangle with 3 limbs, 3 symmetry planes (a, b, c) and 3-fold rotational symmetry (d, e). $f_\textbf{a}(s)$, $f_\textbf{b}(s)$, $f_\textbf{c}(s)$, $f_\textbf{d}(s)$ and $f_\textbf{e}(s)$ are symmetry transformations of $s$ relative to $\textbf{a}$, $\textbf{b}$, $\textbf{c}$, $\textbf{d}$ and $\textbf{e}$, respectively. The observation indices and multipliers characterize the transformations above.
Figure 5: Action indices and multipliers that characterize the 2D robot's symmetry transformations $g_\textbf{a}(a)$, $g_\textbf{b}(a)$, $g_\textbf{c}(a)$, $g_\textbf{d}(a)$ and $g_\textbf{e}(a)$
...and 20 more figures
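
The captions of Figures 4 and 5 describe symmetry transformations through indices and multipliers. The snippet below is a minimal sketch of that idea, not the authors' implementation. It assumes each transformed entry is a single source entry scaled by a multiplier. The function name `apply_symmetry`, the 4-element observation layout, and the 3-limb ordering are hypothetical and chosen only for illustration.

```python
import numpy as np


def apply_symmetry(x, indices, multipliers):
    """Return a transformed copy of x with out[i] = multipliers[i] * x[indices[i]].

    Hypothetical helper: the same routine is assumed to serve both
    observations (f) and actions (g).
    """
    x = np.asarray(x, dtype=float)
    idx = np.asarray(indices, dtype=int)
    mult = np.asarray(multipliers, dtype=float)
    return mult * x[idx]


# Assumed observation layout: [forward_vel, lateral_vel, left_joint, right_joint].
# A reflection swaps the left and right joints and negates the lateral velocity.
REFLECT_INDICES = [0, 1, 3, 2]
REFLECT_MULTIPLIERS = [1.0, -1.0, 1.0, 1.0]

# Assumed per-limb values [limb_0, limb_1, limb_2] for a 3-limb robot.
# A 3-fold rotation cycles the limbs without changing any sign.
ROTATE_INDICES = [1, 2, 0]
ROTATE_MULTIPLIERS = [1.0, 1.0, 1.0]

s = np.array([0.5, 0.2, 0.1, -0.3])
s_reflected = apply_symmetry(s, REFLECT_INDICES, REFLECT_MULTIPLIERS)
s_restored = apply_symmetry(s_reflected, REFLECT_INDICES, REFLECT_MULTIPLIERS)

limbs = np.array([1.0, 2.0, 3.0])
rot_once = apply_symmetry(limbs, ROTATE_INDICES, ROTATE_MULTIPLIERS)
rot_twice = apply_symmetry(rot_once, ROTATE_INDICES, ROTATE_MULTIPLIERS)
rot_thrice = apply_symmetry(rot_twice, ROTATE_INDICES, ROTATE_MULTIPLIERS)
```

In this sketch the reflection is an involution: applying it twice restores the original vector. The rotation is not, since it takes three applications to return to the start, which illustrates the kind of non-involutory transformation mentioned in the TL;DR.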
