Recovery of the optimal control value function in reproducing kernel Hilbert spaces from verification conditions

Tobias Ehring; Behzad Azmi; Bernard Haasdonk

TL;DR

The paper develops a verification-based framework in reproducing kernel Hilbert spaces (RKHSs) to recover the infinite-horizon optimal value function $v^*$ of nonlinear autonomous optimal control problems (OCPs) by enforcing Hamilton-Jacobi-Bellman verification conditions. It recasts the problem as nonlinear optimal recovery in an RKHS and derives finite-dimensional reductions that yield practical algorithms; a Gauss-Newton interpretation of the resulting scheme is algorithmically equivalent to policy iteration (RKHS–PI). Convergence is established in two regimes: global convergence when $v^*$ is real-analytic (via a Gaussian RKHS), and local convergence near the origin when $v^*$ satisfies two-sided quadratic bounds. Numerical experiments on toy, mechanical, and PDE-inspired models (including a 50-dimensional linear heat equation) show rapid convergence and the benefit of structure-aware kernels in high-dimensional settings.

Abstract

Approximating the optimal value function $v^*$ for infinite-horizon, nonlinear, autonomous optimal control problems is both challenging and essential for synthesizing real-time optimal feedback. We develop an abstract optimal recovery framework in reproducing kernel Hilbert spaces (RKHS) for reconstructing unknown target functions from mixed equality and inequality functional constraints. Within this framework, the approximation of $v^*$ is cast as a collocation-type problem derived from verification conditions for optimality -- most prominently, the Hamilton-Jacobi-Bellman (HJB) equation -- that uniquely characterize $v^*$. As the set of collocation points becomes dense in the ambient domain $\Omega$, we establish convergence of the RKHS approximants to $v^*$: globally on $\Omega$ in the RKHS norm when $v^*$ is analytic, and locally (in a neighborhood of the origin) in the RKHS norm when $v^*$ is bounded from above and below by quadratic functions. Furthermore, we show that a practical numerical realization of the abstract scheme reduces to the classical policy iteration algorithm. Numerical experiments support the effectiveness of the proposed approach.
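To make the policy iteration mentioned above concrete, the following is a minimal sketch on a scalar linear-quadratic problem. It is not the paper's RKHS scheme: the system $\dot{x} = a x + b u$, the cost $\int_0^\infty (q x^2 + r u^2)\,dt$, the one-parameter ansatz $v(x) = p x^2$, and all function and parameter names are assumptions chosen for illustration. Each step solves the linear HJB-type equation for the current feedback $u = -k x$ (policy evaluation) and then updates the feedback by minimizing the Hamiltonian (policy improvement).

```python
import math


def policy_iteration_scalar_lqr(a, b, q, r, k0, iters=20, tol=1e-12):
    """Policy iteration for x' = a*x + b*u with running cost q*x^2 + r*u^2.

    Hypothetical illustration: the value function is sought as v(x) = p*x^2
    and the feedback as u(x) = -k*x. The initial gain k0 must be stabilizing.
    """
    if b * k0 - a <= 0:
        raise ValueError("k0 must stabilize the system, i.e. a - b*k0 < 0")
    k = k0
    p = None
    history = []
    for _ in range(iters):
        # Policy evaluation: with u = -k*x and v(x) = p*x^2, the linear equation
        # v'(x)*(a - b*k)*x + (q + r*k^2)*x^2 = 0 determines p in closed form.
        p_new = (q + r * k * k) / (2.0 * (b * k - a))
        history.append(p_new)
        # Policy improvement: u = -b*v'(x)/(2*r) = -(b*p/r)*x minimizes the Hamiltonian.
        k = b * p_new / r
        converged = p is not None and abs(p_new - p) < tol
        p = p_new
        if converged:
            break
    return p, k, history


def riccati_solution(a, b, q, r):
    """Positive root of the scalar Riccati equation 2*a*p - (b^2/r)*p^2 + q = 0."""
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)


# Example with an unstable open loop (a > 0) and a stabilizing initial gain.
p, k, history = policy_iteration_scalar_lqr(a=1.0, b=1.0, q=1.0, r=1.0, k0=2.0)
p_exact = riccati_solution(a=1.0, b=1.0, q=1.0, r=1.0)
```

In this toy setting the iterates `history` approach `p_exact`, the coefficient of the exact value function, so the HJB equation is satisfied in the limit. In the paper's setting the quadratic ansatz would be replaced by an RKHS approximant fitted at collocation points.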
