Inferring System and Optimal Control Parameters of Closed-Loop Systems from Partial Observations
Victor Geadah, Juncal Arbelaiz, Harrison Ritz, Nathaniel D. Daw, Jonathan D. Cohen, Jonathan W. Pillow
TL;DR
The paper addresses the joint inference of the dynamics $(A,B)$ and the optimal LQR costs $(Q,R)$ from partial, noisy observations of a system operating under closed-loop control. It frames the problem as maximizing the marginal likelihood of a linear-Gaussian state-space model, uses the EM algorithm to estimate the closed-loop dynamics, and then derives conditions and procedures for disentangling the system and cost parameters. A key finding is that, in the infinite-horizon setting with only state observations, the individual components of $(A,B,Q,R)$ are not identifiable. Identifiability improves when moving to finite horizons or when incorporating partial observations of the control signal, and Kleinman-type iterations and Sylvester-DARE relations provide practical recovery procedures. These results offer a principled way to estimate both plant dynamics and control properties in partially observed, closed-loop settings, with potential applications to neuroscience and other domains where control processes are inferred from indirect measurements.
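As a reading aid, the setting summarized above can be written in the textbook discrete-time LQR form below. The notation is an assumption made here for illustration and is not taken from the paper: $C$ is an observation matrix, $w_t$ and $v_t$ are process and observation noise with covariances $\Sigma_w$ and $\Sigma_v$, $K$ is the infinite-horizon feedback gain, and $P$ solves the discrete algebraic Riccati equation (DARE).

$$
\begin{aligned}
x_{t+1} &= A x_t + B u_t + w_t, & w_t &\sim \mathcal{N}(0, \Sigma_w),\\
y_t &= C x_t + v_t, & v_t &\sim \mathcal{N}(0, \Sigma_v),\\
u_t &= -K x_t, & K &= (R + B^\top P B)^{-1} B^\top P A,\\
P &= Q + A^\top P A - A^\top P B \,(R + B^\top P B)^{-1} B^\top P A.
\end{aligned}
$$

Substituting the control law gives the closed-loop recursion $x_{t+1} = (A - BK)\,x_t + w_t$. Under these assumptions, data containing only $y_t$ inform the closed-loop matrix $A - BK$ rather than $A$, $B$, $Q$, and $R$ separately.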
Abstract
We consider the joint problem of system identification and inverse optimal control for discrete-time stochastic Linear Quadratic Regulators. We analyze finite and infinite time horizons in a partially observed setting, where the state is observed noisily. To recover closed-loop system parameters, we develop inference methods based on probabilistic state-space model (SSM) techniques. First, we show that the system parameters are non-identifiable from closed-loop measurements in the infinite-horizon setting, and we provide exact and numerical methods to disentangle the parameters. Second, we show that parameter identifiability and recovery improve by either (1) incorporating additional partial measurements of the control signals or (2) moving to the finite-horizon setting. We further illustrate the performance of our methodology through numerical examples.
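The infinite-horizon non-identifiability mentioned in the abstract can be illustrated with a small numerical sketch. The example below is not the paper's method. It uses two textbook invariances of the LQR gain, assumed here for illustration: scaling $(Q,R)$ by a common positive constant, and reparameterizing the input as $(B, R) \to (BM, M^\top R M)$ for an invertible $M$. Both leave the closed-loop matrix $A - BK$ unchanged. The system matrices, the helper names `lqr_gain` and `closed_loop`, and the plain Riccati iteration are all hypothetical choices.

```python
import numpy as np


def lqr_gain(A, B, Q, R, iters=2000):
    """Infinite-horizon discrete-time LQR gain via plain Riccati iteration.

    Illustrative helper (not from the paper): iterates the Riccati
    recursion a fixed number of times instead of calling a DARE solver.
    """
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    # Gain computed from the (numerically converged) P.
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)


def closed_loop(A, B, Q, R):
    """Closed-loop state matrix A - B K under the optimal gain."""
    return A - B @ lqr_gain(A, B, Q, R)


# Hypothetical example system: a discretized double integrator.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[0.5]])

A_cl = closed_loop(A, B, Q, R)

# Invariance 1: a common positive scaling of the costs.
c = 3.0
A_cl_scaled = closed_loop(A, B, c * Q, c * R)

# Invariance 2: input reparameterization B -> B M, R -> M^T R M.
M = np.array([[2.0]])
A_cl_mixed = closed_loop(A, B @ M, Q, M.T @ R @ M)

print(np.allclose(A_cl, A_cl_scaled), np.allclose(A_cl, A_cl_mixed))
```

Three distinct parameter tuples produce the same closed-loop matrix, so state-only trajectories generated by any of them follow the same closed-loop dynamics. This sketch covers only these two simple invariances, and the paper's analysis may characterize a different or larger set.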
