A Model-Free Data-Driven Algorithm for Continuous-Time Control
Sean R. Bowerfind, Matthew R. Kirchner, Gary A. Hewer, D. Reed Robinson, Paula Chen, Alireza Farahmandi, Katia Estabridis
TL;DR
This work addresses model-free synthesis of infinite-horizon LQR controllers for continuous-time systems using only finite, noisy input-output data. It derives a necessary condition that the optimal value function must satisfy along any observed trajectory, which yields an implicit formulation that requires no explicit knowledge of the system matrices $A$ and $B$. The problem is cast as a nonlinear program (NLP) over $L$, with $P=L^T L$ and $S=PB$, from which the stabilizing gain $K=R^{-1}S^T$ is obtained. The approach is shown to be equivalent to a continuous-time form of Q-learning, and a discrete-time implementation is reported to offer improved numerical stability. Two case studies, a known linear Boeing 747-style system and an unknown nonlinear quadcopter, show that the model-free gains closely approximate the classical LQR gains and deliver comparable closed-loop performance. The result is a data-driven, offline alternative for LQR design, suited to systems for which first-principles models are unavailable or hard to linearize. Future work focuses on data-design metrics and real-flight validation.
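To make the trajectory-based condition concrete, the following shows one standard way such an identity arises for a linear system. It is a sketch under stated assumptions, not necessarily the authors' exact derivation: the state weight $Q$, the cost $\int_0^\infty (x^T Q x + u^T R u)\,dt$, and the quadratic value function $V(x)=x^T P x$ are assumed notation that this summary does not spell out.

$$
\begin{aligned}
\frac{d}{dt}\,x^T P x &= x^T\left(A^T P + P A\right)x + 2\,x^T P B\,u && \text{(along } \dot{x} = Ax + Bu \text{, for any input } u\text{)}\\
A^T P + P A &= -Q + P B R^{-1} B^T P && \text{(algebraic Riccati equation)}\\
\frac{d}{dt}\,x^T P x &= -x^T Q x + x^T S R^{-1} S^T x + 2\,x^T S\,u, && S = PB
\end{aligned}
$$

The last line involves only $P$, $S$, and the measured $x$ and $u$; $A$ and $B$ no longer appear. Because $S$ enters quadratically, enforcing this relation over data is a nonlinear problem, which is consistent with the NLP formulation described above.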
Abstract
This paper presents an algorithm for synthesizing an infinite-horizon LQR optimal feedback controller for continuous-time systems. The algorithm does not require knowledge of the system dynamics; it uses only a finite-length sample of (possibly suboptimal) input-output data. It is based on a constrained optimization problem that enforces a necessary condition on the dynamics of the optimal value function along an arbitrary trajectory. The paper gives the derivation and demonstrates the algorithm on both linear and nonlinear example systems inspired by air vehicles.
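As a hedged illustration of enforcing a value-function condition on data without using the dynamics, the sketch below fits a scalar LQR gain from samples alone. It is a toy under stated assumptions, not the authors' algorithm. It assumes a scalar plant $\dot{x} = a x + b u$ with hypothetical values $a=b=1$, weights $q=r=1$, and noise-free samples of $x$, $u$, and $\dot{x}$ (a pointwise derivative form rather than a condition integrated along a trajectory). It treats both $l$ (with $p=l^2$) and $s$ as unknowns, and it uses a plain Gauss-Newton iteration as the solver. The residual is $2p\,x\dot{x} + q x^2 - (s^2/r)\,x^2 - 2 s\,x u$, which is zero at the LQR solution with $s = pb$.

```python
import numpy as np

# Hidden toy plant, used ONLY to generate data (assumed values).
a_true, b_true = 1.0, 1.0
q, r = 1.0, 1.0  # assumed state and input weights

# Arbitrary (suboptimal) samples; xdot is assumed to be measured exactly.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 50)
u = rng.uniform(-1.0, 1.0, 50)
xdot = a_true * x + b_true * u


def residual(theta, x, u, xdot):
    """Value-function condition per sample; zero at the LQR solution."""
    l, s = theta
    p = l * l  # p = l^2 keeps the value function nonnegative
    return 2 * p * x * xdot + q * x**2 - (s**2 / r) * x**2 - 2 * s * x * u


def jacobian(theta, x, u, xdot):
    """Partial derivatives of the residual with respect to (l, s)."""
    l, s = theta
    d_l = 4 * l * x * xdot
    d_s = -(2 * s / r) * x**2 - 2 * x * u
    return np.column_stack([d_l, d_s])


def fit_gain(x, u, xdot, theta0=(1.0, 2.0), iters=30):
    """Gauss-Newton on the residual; never touches a_true or b_true."""
    theta = np.array(theta0, dtype=float)
    for _ in range(iters):
        e = residual(theta, x, u, xdot)
        J = jacobian(theta, x, u, xdot)
        step, *_ = np.linalg.lstsq(J, -e, rcond=None)
        theta = theta + step
    l, s = theta
    return l * l, s, s / r  # p, s, and gain K = s / r


p_hat, s_hat, K_hat = fit_gain(x, u, xdot)

# Reference gain from the scalar Riccati equation (uses the model,
# for comparison only).
K_true = (a_true + np.sqrt(a_true**2 + b_true**2 * q / r)) / b_true
print(K_hat, K_true)
```

The starting point `theta0` is a hand-picked assumption; like any local nonlinear solve, the iteration can fail or converge elsewhere from a poor initial guess. With noisy data the residual would not reach zero, and the fit becomes a least-squares compromise.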
