Bounding the Error of Value Functions in Sobolev Norm Yields Bounds on Suboptimality of Controller Performance

Morgan Jones, Matthew Peet

TL;DR

The paper tackles suboptimality guarantees for controllers synthesized from approximate value functions (VFs) in finite-horizon nonlinear control. It proves a sharp suboptimality bound: the cost gap $L(u_J,x_0)$ is bounded by a constant times the Sobolev norm error $\|J - V^*\|_{W^{1,\infty}(B_R(0)\times[0,T])}$, enabling arbitrarily small performance loss when VF approximations converge in $W^{1,\infty}$; this does not hold for $L^\infty$ error. The core argument uses a modified OCP in which $J$ plays the role of the VF, together with the HJB framework for viscosity solutions, to establish the bound, with explicit dependence on horizon and system growth parameters. Numerical examples illustrate that uniform convergence of $J$ to $V^*$ can fail to guarantee near-optimal performance unless the convergence is in the Sobolev sense, and confirm that the bound behaves as predicted with respect to the Sobolev error. The results justify prioritizing Sobolev-error control in numerical VF approximations (e.g., PINNs) to obtain provable controller performance guarantees in continuous-time, finite-horizon settings.
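
In symbols, the bound summarized above has roughly the following shape. This is a sketch of its form only: the constant $C$ is a placeholder name (not taken from the paper) for the factor that depends on the horizon and the system growth parameters, and reading $L(u_J,x_0)$ as the cost incurred under $u_J$ minus the optimal cost is an assumption based on the phrase "cost gap".

$$
L(u_J, x_0) \;\le\; C \,\|J - V^*\|_{W^{1,\infty}(B_R(0)\times[0,T])}
$$

Under one common convention (conventions differ, and the paper's exact definition may too), the $W^{1,\infty}$ norm controls both the function and its first derivatives:

$$
\|f\|_{W^{1,\infty}(\Omega)} \;=\; \max\Big\{ \|f\|_{L^\infty(\Omega)},\ \max_{i}\, \|\partial_i f\|_{L^\infty(\Omega)} \Big\}
$$

so convergence of $J$ to $V^*$ in this norm forces the gradients, and not only the values, to agree.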

Abstract

Optimal feedback controllers for nonlinear systems can be derived by solving the Hamilton-Jacobi-Bellman (HJB) equation. However, because the HJB is a nonlinear partial differential equation, in general only approximate solutions can be numerically found. While numerical error bounds on approximate HJB solutions are often available, we show that these bounds do not necessarily translate into guarantees on the suboptimality of the resulting controllers. In this paper, we establish that if the numerical error in the HJB solution can be bounded in a Sobolev norm, a norm involving spatial derivatives, then the suboptimality of the corresponding feedback controller can also be bounded, and this bound can be made arbitrarily small. In contrast, we demonstrate that such guarantees do not hold when the error is measured in more typical norms, such as the uniform norm ($L^\infty$). Our results apply to systems governed by locally Lipschitz continuous dynamics over a finite time horizon with a compact input space. Numerical examples are provided to illustrate the theoretical findings.
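
The gap between uniform and Sobolev error can be seen in a small numerical sketch. The functions below are hypothetical and chosen only for illustration (they are not the paper's examples): a stand-in "true" value function $V(x)=x^2$ on $[-1,1]$ and an approximation $J_\varepsilon(x)=V(x)+\varepsilon\sin(x/\varepsilon)$. The uniform error shrinks with $\varepsilon$, while the derivative error stays near 1, so $J_\varepsilon$ converges in $L^\infty$ but not in $W^{1,\infty}$.

```python
import numpy as np

def sobolev_gap_demo(eps, num_points=20001):
    """Return (uniform error, derivative error) for J_eps = V + eps*sin(x/eps) on [-1, 1]."""
    x = np.linspace(-1.0, 1.0, num_points)

    # Hypothetical stand-in for the true value function and its derivative.
    v = x**2
    dv = 2.0 * x

    # Approximation: small amplitude perturbation with fast oscillation.
    j = v + eps * np.sin(x / eps)
    dj = dv + np.cos(x / eps)  # exact derivative of j

    uniform_err = np.max(np.abs(j - v))    # grid estimate of sup |J - V|
    deriv_err = np.max(np.abs(dj - dv))    # grid estimate of sup |J' - V'|
    return uniform_err, deriv_err

for eps in (1e-1, 1e-2, 1e-3):
    u_err, d_err = sobolev_gap_demo(eps)
    print(f"eps={eps:g}: sup|J-V|={u_err:.2e}, sup|J'-V'|={d_err:.2e}")
```

Because a feedback controller synthesized from a VF typically depends on the VF's gradient, a derivative error that does not vanish is what leaves room for the resulting inputs to differ even when the VFs themselves are uniformly close.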

Paper Structure

This paper contains 5 sections, 4 theorems, 37 equations, 2 figures.

Key Result

Theorem 2.1

Consider the family of OCPs in Eq. (opt: optimal control problem). Suppose $V \in C^1(\mathbb{R}^n \times \mathbb{R}, \mathbb{R})$ solves the HJB PDE in Eq. (eqn: general HJB PDE). Then $V(x,t)=V^*(x,t)$, where $V^*$ is the VF defined in Eq. (opt: optimal control problem), and $u^*: [t_0,T] \to U$ solves the OCP i

Figures (2)

  • Figure 1: Plot showing close VFs, $\textcolor{blue}{V^*} \approx \textcolor{red}{J}$, that result in very different inputs, $\textcolor{red}{u_J} \not\approx \textcolor{blue}{u^*}$, for fixed time and state.
  • Figure 2: (\ref{fig: Illistrative example}) Graph showing trajectories of Example \ref{ex:lib} associated with input $u_{V_1}$, where $V_1$ is given in Eq. \ref{eq: V1 and V2}, $\varepsilon=\frac{1}{n}$ and $n=1$ to $20000$. (\ref{fig: lokta}) Graph showing numerically calculated performance gaps, $L(u_{V_1},x_0)$ and $L(u_{V_2},x_0)$, using controllers synthesized from the approximate VFs given in Eq. \ref{eq: V1 and V2} for decreasing $\varepsilon>0$, along with the theoretical bounds for these performance gaps given in Eq. \ref{eq: ex1 theoretical perfoamnce bounds}. (\ref{fig: coupled linear ODE}) Graph showing numerically calculated logarithms of the performance gap, $L(u_{\hat{V}_\varepsilon},x_0)$, using controllers synthesized from the approximate VFs given in Eq. \ref{eq: approx VF example 2} for decreasing $\varepsilon>0$, along with the logarithm of the theoretical performance gap bounds given in Eq. \ref{eq: perf bound ex 2}.

Theorems & Definitions (9)

  • Definition 1
  • Theorem 2.1: liberzon2011calculus
  • Lemma 2.2: Page 555 evans2010partial
  • Lemma 3.1
  • proof
  • Theorem 3.2
  • proof
  • Example 1
  • Example 2