Solving Infinite-Horizon Optimal Control Problems using the Extreme Theory of Functional Connections
Tanay Raghunandan Srinivasa, Suraj Kumar
TL;DR
The paper derives optimal feedback laws for infinite-horizon optimal control problems (OCPs) by solving the HJB PDE, with the optimal policy expressed through the convex conjugate of the control cost. To learn the value function efficiently, it introduces X-TFC, a hybrid of TFC and ELM that satisfies the boundary conditions by construction. Training proceeds in two stages: an ELM-based initialization, followed by gradient-based refinement against the HJB residual. The authors report fast convergence and close agreement with analytical solutions on linear and nonlinear benchmarks, including a high-dimensional spacecraft detumbling problem. Key contributions include the closed-form policy under quadratic cost, the constrained HJB learning recipe with $V(x) = \eta(x;\Theta) + (V(0) - \eta(0;\Theta))$, and extensive numerical validation showing errors at the 1e-3 to 1e-4 level and global convergence. The approach is relevant to real-time control, where boundary-condition satisfaction and computational efficiency are crucial, with potential extensions to finite-horizon problems and improved initialization strategies.
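The constrained expression $V(x) = \eta(x;\Theta) + (V(0) - \eta(0;\Theta))$ can be illustrated with a minimal sketch. The scalar state, the tanh activation, the uniform weight ranges, and the names `make_elm`, `eta`, and `constrained_value` are assumptions made here for illustration and are not taken from the paper. The point is only that $V(0)$ equals the prescribed value for any output weights, so the boundary condition never has to be learned.

```python
import math
import random

def make_elm(n_hidden, seed=0):
    """Draw fixed random input weights and biases (ELM-style; ranges are assumed)."""
    rng = random.Random(seed)
    w = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
    b = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
    return w, b

def eta(x, w, b, beta):
    """Free function eta(x) = sum_j beta_j * tanh(w_j * x + b_j) for scalar x."""
    return sum(bj * math.tanh(wj * x + cj) for wj, cj, bj in zip(w, b, beta))

def constrained_value(x, w, b, beta, v0=0.0):
    """V(x) = eta(x) + (V(0) - eta(0)); satisfies V(0) = v0 for any beta."""
    return eta(x, w, b, beta) + (v0 - eta(0.0, w, b, beta))

# Usage: the boundary value holds for arbitrary (untrained) output weights.
w, b = make_elm(n_hidden=8, seed=1)
rng = random.Random(2)
beta = [rng.uniform(-5.0, 5.0) for _ in range(8)]
v_at_origin = constrained_value(0.0, w, b, beta)
```

Only `beta` would be trained in an ELM setting; the input weights and biases stay fixed after sampling.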
Abstract
This paper presents a physics-informed machine learning approach for synthesizing optimal feedback control policies for infinite-horizon optimal control problems by solving the Hamilton-Jacobi-Bellman (HJB) partial differential equation (PDE). The optimal control policy is derived analytically for affine dynamical systems with separable and strictly convex control costs, expressed as a function of the gradient of the value function. The resulting HJB PDE is then solved by approximating the value function using the Extreme Theory of Functional Connections (X-TFC), a hybrid approach that combines the Theory of Functional Connections (TFC) with the Extreme Learning Machine (ELM) algorithm. This approach ensures analytical satisfaction of boundary conditions and significantly reduces training cost compared to traditional Physics-Informed Neural Networks (PINNs). We benchmark the method on linear and nonlinear systems with known analytical solutions and demonstrate its effectiveness on control tasks such as optimal spacecraft detumbling.
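To make the link between the value-function gradient and the policy concrete, here is a minimal sketch for a scalar linear system $\dot{x} = a x + b u$. It assumes a running cost $q x^2 + r u^2$ with no factor of 1/2, which may differ from the paper's convention. Under that assumption, minimizing the Hamiltonian gives $u^* = -\tfrac{1}{2} r^{-1} b \, V'(x)$, and the value function is $V(x) = p x^2$ with $p$ the positive root of the scalar Riccati equation. The function names are illustrative.

```python
import math

def lqr_value_coeff(a, b, q, r):
    """Positive root p of the scalar Riccati equation 2*a*p - (b*p)**2 / r + q = 0."""
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)

def policy_from_gradient(dV, b, r):
    """Minimizer of r*u**2 + dV*b*u over u, i.e. u* = -(1/2) * b * dV / r."""
    return -0.5 * b * dV / r

def hjb_residual(x, a, b, q, r, p):
    """HJB residual q*x^2 + r*u^2 + V'(x)*(a*x + b*u) with V(x) = p*x^2 and u = u*."""
    dV = 2.0 * p * x
    u = policy_from_gradient(dV, b, r)
    return q * x * x + r * u * u + dV * (a * x + b * u)

# Usage: an unstable plant (a > 0) with unit weights (values chosen for illustration).
a, b, q, r = 1.0, 1.0, 1.0, 1.0
p = lqr_value_coeff(a, b, q, r)
residual = hjb_residual(0.7, a, b, q, r, p)  # vanishes up to round-off
```

In a learning setting, `dV` would come from differentiating the approximated value function rather than from the analytical `p`, and the same residual would serve as the training loss.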
