Constructive Nonlinear Control of Underactuated Systems via Zero Dynamics Policies
William Compton, Ivan Dario Jimenez Rodriguez, Noel Csomay-Shanklin, Yisong Yue, Aaron D. Ames
TL;DR
This work addresses stabilizing underactuated nonlinear systems by constructing Zero Dynamics Policies (ZDPs) that map unactuated coordinates to desired actuated coordinates, defining a controlled invariant zero dynamics manifold. It proves local existence near the origin for locally controllable systems and shows that stabilizing the zero dynamics, together with an appropriately designed output, guarantees exponential stabilization of the full state; a Lyapunov-based composite analysis underpins this result. To extend applicability, the authors develop an optimal-control–based method for learning ZDPs and demonstrate a neural-network implementation that expands the region of attraction beyond what a linear-quadratic regulator achieves. The cartpole example illustrates the practical benefits: ZDPs yield smoother responses and a larger feasible set of initial conditions, offering a principled route to stabilize a wide class of underactuated systems with data-driven enhancements.
Abstract
Stabilizing underactuated systems is an inherently challenging control task due to fundamental limitations on how the control input affects the unactuated dynamics. Decomposing the system into actuated (output) and unactuated (zero) coordinates provides useful insight into how the input enters the system dynamics. In this work, we leverage the structure of this decomposition to formalize the idea of Zero Dynamics Policies (ZDPs): mappings from the unactuated coordinates to desired actuated coordinates. Specifically, we show that a ZDP exists in a neighborhood of the origin, and prove that combining output stabilization with a ZDP results in stability of the full system state. We detail a constructive method of obtaining ZDPs in a neighborhood of the origin, and propose a learning-based approach which leverages optimal control to obtain ZDPs with much larger regions of attraction. We demonstrate that such a paradigm can be used to stabilize the cartpole, a canonical underactuated system, and showcase an improvement over the nominal performance of LQR.
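
For concreteness, the decomposition and the policy described above can be written out as a short sketch. The notation is assumed for illustration and is not fixed by the abstract: $\eta$ denotes the actuated coordinates, $z$ the unactuated coordinates, $\psi$ the ZDP, and $\hat{f}$, $\hat{g}$, $\omega$ are placeholder vector fields. The condition $\psi(0) = 0$ is likewise an assumption, made so that the origin lies on the manifold.

```latex
% Assumed normal-form decomposition: the input u enters only the
% actuated dynamics; the unactuated dynamics are affected through eta.
\begin{align}
  \dot{\eta} &= \hat{f}(\eta, z) + \hat{g}(\eta, z)\, u, \\
  \dot{z}    &= \omega(\eta, z).
\end{align}

% A Zero Dynamics Policy maps unactuated coordinates to desired
% actuated coordinates (psi(0) = 0 is an illustrative assumption).
\begin{equation}
  \eta_d = \psi(z), \qquad \psi(0) = 0.
\end{equation}

% The policy defines a manifold and an output measuring distance to it.
\begin{equation}
  \mathcal{M}_\psi = \{ (\eta, z) : \eta = \psi(z) \},
  \qquad y = \eta - \psi(z).
\end{equation}

% Restricted to the manifold (y = 0 identically), only the
% unactuated coordinates evolve: the zero dynamics.
\begin{equation}
  \dot{z} = \omega(\psi(z), z).
\end{equation}
```

Read this way, the stability claim in the abstract has two ingredients: a controller that drives the output $y$ to zero, and a choice of $\psi$ for which $z = 0$ is stable under the reduced dynamics $\dot{z} = \omega(\psi(z), z)$. Together these would imply convergence of the full state $(\eta, z)$ to the origin, under the assumptions stated above.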
