Transfer Learning for a Class of Cascade Dynamical Systems
Shima Rabiei, Sandipan Mishra, Santiago Paternain
TL;DR
The paper addresses transferring a reinforcement learning (RL) policy trained on a reduced-order model to a full cascade dynamical system by exploiting an inner-loop controller. It establishes transfer guarantees based on input-to-state stability (ISS) that bound the performance degradation in terms of the inner-loop stability parameters $\alpha$ and $\beta$, the variation of the reference, and a Lipschitz constant $L$ linking reduced-model transitions to the commanded state $X^*$. The authors validate the theory on a quadrotor navigation task, showing that increasing the inner-loop gain $K_p$ (which reduces $\alpha$) lowers the transfer loss and brings the high-order dynamics closer to the reduced-order dynamics. The work provides a principled framework for safe and efficient RL transfer in systems with nested control loops, with practical guidance on designing the inner-loop controller to improve transfer fidelity.
Abstract
This work considers the problem of transfer learning in the context of reinforcement learning. Specifically, we consider training a policy in a reduced-order system and deploying it in the full-state system. The motivation for this training strategy is that running simulations in the full-state system may take excessive time if the dynamics are complex. While transfer learning alleviates the computational issue, the transfer guarantees depend on the discrepancy between the two systems. In this work, we consider a class of cascade dynamical systems, where the dynamics of a subset of the state space influence the rest of the states but not vice versa. The reinforcement learning policy learns in a model that ignores the dynamics of these states and treats them as commanded inputs. In the full-state system, these dynamics are handled using a classic controller (e.g., a PID). These systems have vast applications in the control literature, and their structure allows us to provide transfer guarantees that depend on the stability of the inner-loop controller. Numerical experiments on a quadrotor support the theoretical findings.
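The cascade structure described in the abstract can be illustrated with a toy example. The sketch below is not the authors' quadrotor setup: it assumes a hypothetical one-dimensional system with position p as the outer state and velocity v as the inner state, a fixed proportional rule standing in for the learned policy, and a pure proportional inner loop with gain kp in place of a full PID. The reduced model treats the commanded velocity as if it were applied instantly, while the full model tracks it through the inner loop. All names and numerical values are illustrative assumptions.

```python
def rollout_gap(kp, k_outer=1.0, goal=1.0, dt=0.01, steps=1000):
    """Largest |p_full - p_reduced| seen when the same outer policy is
    rolled out in a reduced model and in a full cascade model.

    Hypothetical toy system (an assumption, not taken from the paper):
      reduced: p' = v_cmd                      (inner dynamics ignored)
      full:    p' = v,  v' = kp * (v_cmd - v)  (proportional inner loop)
    """
    p_red = 0.0            # outer state in the reduced model
    p_full, v = 0.0, 0.0   # outer and inner states in the full model
    worst = 0.0
    for _ in range(steps):
        # Stand-in for the learned policy: command a velocity toward the goal.
        v_cmd_red = k_outer * (goal - p_red)
        v_cmd_full = k_outer * (goal - p_full)

        # Reduced model: the commanded velocity acts directly on p.
        p_red += dt * v_cmd_red

        # Full model: forward-Euler step of the cascade; the inner loop
        # drives v toward v_cmd_full, and v (not the command) moves p.
        p_full, v = p_full + dt * v, v + dt * kp * (v_cmd_full - v)

        worst = max(worst, abs(p_full - p_red))
    return worst


# Compare the trajectory discrepancy for a few inner-loop gains.
gaps = {kp: rollout_gap(kp) for kp in (2.0, 10.0, 50.0)}
for kp, gap in gaps.items():
    print(f"kp={kp:5.1f}  max trajectory gap={gap:.4f}")
```

In this toy setting a stiffer inner loop makes the full system follow the reduced model more closely, which is consistent with the dependence of the transfer guarantees on inner-loop stability. The step size is chosen so that dt * kp stays below 1, keeping the forward-Euler update of the inner loop stable for the gains shown.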
