Performance Index Shaping for Closed-loop Optimal Control
Ayush Rai, Shaoshuai Mou, Brian D. O. Anderson
TL;DR
This work tackles the challenge of steering closed-loop behavior by shaping the performance index in infinite-horizon optimal control. It develops an analytical link between index modifications and the resulting closed-loop law, collapsing the typical bi-level design problem into a tractable single-level formulation and proving global stability and input-to-state stability (ISS). The authors introduce a structured form for added cost terms that yields a closed-form modified value function and control law, along with a gradient-based algorithm to iteratively tune index parameters for desired trajectory-level objectives. Through linear and nonlinear examples, including overshoot reduction and pendulum-angle constraints, the approach demonstrates improved transient performance and safety without re-solving the full optimal control problem from scratch. The framework offers a principled path for co-designing performance indices and optimal laws with potential extensions to constrained, predictive, and data-driven control settings.
Abstract
The design of the performance index, also referred to as cost or reward shaping, is central to both optimal control and reinforcement learning, as it directly determines the behaviors, trade-offs, and objectives that the resulting control laws seek to achieve. A commonly used approach for this task in recent years is differentiable trajectory optimization, which allows gradients to be computed with respect to cost parameters by differentiating through an optimal control solver. However, it often requires re-solving the underlying optimal control problem at every iteration, which makes it computationally expensive. In this work, assuming known dynamics, we propose a novel framework that analytically links the performance index to the resulting closed-loop optimal control law, thereby transforming a typically bi-level inverse problem into a tractable single-level formulation. Our approach is motivated by the question: given a closed-loop control law that solves an infinite-horizon optimal control problem, how does this law change when the performance index is modified with additional terms? This formulation yields closed-form characterizations for broad classes of systems and performance indices, which not only facilitate interpretation and stability analysis, but also provide insight into the robust stability and input-to-state stable behavior of the resulting nonlinear closed-loop system. Moreover, this analytical perspective enables the generalization of our approach to diverse design objectives, yielding a unifying framework for performance index shaping. Given specific design objectives, we propose a systematic methodology to guide the shaping of the performance index and thereby design the resulting optimal control law.
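
The motivating question can be made concrete in the simplest setting. The sketch below is not the framework proposed in the paper; it is the textbook scalar linear-quadratic regulator, included only as an illustration. It assumes dynamics dx/dt = a*x + b*u and a performance index equal to the integral of q*x^2 + r*u^2, and the names scalar_lqr, shaped_lqr, and dq are hypothetical. In this special case the Riccati equation 2*a*p - (b^2/r)*p^2 + q = 0 has a closed-form positive root, so the effect of adding an extra state penalty dq*x^2 to the index can be read off directly instead of being obtained by re-running a numerical solver.

```python
import math


def scalar_lqr(a, b, q, r):
    """Closed-form scalar LQR for dx/dt = a*x + b*u, cost = integral of q*x^2 + r*u^2.

    Returns (p, k, pole): the value-function coefficient (V(x) = p*x^2),
    the feedback gain of the law u = -k*x, and the closed-loop pole a - b*k.
    """
    if b == 0 or q <= 0 or r <= 0:
        raise ValueError("need b != 0, q > 0, r > 0")
    root = math.sqrt(a * a + b * b * q / r)
    p = r * (a + root) / (b * b)  # positive root of 2*a*p - (b^2/r)*p^2 + q = 0
    k = b * p / r                 # optimal gain
    return p, k, a - b * k        # the pole equals -root, so it is always stable


def shaped_lqr(a, b, q, r, dq):
    """Same problem after adding the (illustrative) extra term dq*x^2 to the index."""
    return scalar_lqr(a, b, q + dq, r)


if __name__ == "__main__":
    base = scalar_lqr(1.0, 1.0, 3.0, 1.0)
    shaped = shaped_lqr(1.0, 1.0, 3.0, 1.0, 5.0)
    print("baseline (p, k, pole):", base)
    print("shaped   (p, k, pole):", shaped)
```

With a = b = r = 1 and q = 3, the baseline gain is k = 3 and the closed-loop pole is -2; adding dq = 5 raises the gain to k = 4 and moves the pole to -3. The added cost term therefore maps to the new control law through a formula rather than through a nested optimization, which is the kind of relationship the abstract describes for much broader classes of systems and indices.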
