3DIOC: Direct Data-Driven Inverse Optimal Control for LTI Systems

Chendi Qu, Jianping He, Xiaoming Duan

TL;DR

This work tackles inverse optimal control for linear time-invariant systems by learning the finite-horizon LQ objective weights $Q$ and $R$ directly from input-output trajectories, without identifying the system dynamics. It builds a data-enabled, model-free IOC framework using the Fundamental Lemma to obtain an input-output representation and derives model-free KKT conditions that connect data blocks to the unknown $Q$ and $R$. The authors propose Vanilla and Simplified 3DIOC formulations with identifiability and perturbation analyses, including a special LQR-IOC case, and demonstrate computational efficiency and robustness through simulations. The approach reduces data and computation while providing guarantees on identifiability and sensitivity to noise, with potential extensions to process-noise and Koopman-based nonlinear settings.

Abstract

This paper develops a direct data-driven inverse optimal control (3DIOC) algorithm for a linear time-invariant (LTI) system under linear quadratic (LQ) control, where the underlying objective function is learned directly from measured input-output trajectories without system identification. By introducing the Fundamental Lemma, we establish the input-output representation of the LTI system. We accordingly propose a model-free necessary optimality condition for the forward LQ problem to build a connection between the objective function and the collected data, with which the inverse optimal control problem is solved. We further improve the algorithm so that it requires less computation and data. An identifiability condition and a perturbation analysis are provided. Simulations demonstrate the efficiency and performance of our algorithms.

Paper Structure (14 sections, 11 theorems, 54 equations, 3 figures, 1 table)

Key Result

Lemma 1

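(Fundamental Lemma, Willems et al. 2005) Consider the linear time-invariant system $\mathscr{B}$. If a) system $\mathscr{B}$ is controllable, b) $w^d \in \mathscr{B}|_T$, and c) the input $u^d$ is persistently exciting of order $L+n$, then we have $\mathrm{colspan}(\mathscr{H}_L(w^d)) = \mathscr{B}|_L$, where $\mathscr{H}_L(w^d)$ is the Hankel matrix with $L$ block rows built from $w^d$.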
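
In its standard form, the lemma says that under conditions a) to c) every length-$L$ input-output trajectory of the system is a linear combination of the columns of a Hankel matrix built from the recorded data. The sketch below checks this numerically with NumPy. It is an illustration under stated assumptions, not the authors' code: the matrices `A`, `B`, `C`, `D`, the sizes `L` and `T`, and the helpers `block_hankel` and `simulate` are all hypothetical choices made for this example.

```python
import numpy as np

def block_hankel(w, L):
    """Depth-L block Hankel matrix of a signal w with shape (T, q)."""
    T, _ = w.shape
    # Column j stacks the window w[j], ..., w[j+L-1] into one vector.
    return np.column_stack([w[j:j + L].reshape(-1) for j in range(T - L + 1)])

def simulate(A, B, C, D, x0, u):
    """Roll out x+ = A x + B u, y = C x + D u and return the outputs."""
    x, ys = x0, []
    for uk in u:
        ys.append(C @ x + D @ uk)
        x = A @ x + B @ uk
    return np.array(ys)

rng = np.random.default_rng(0)

# Hypothetical controllable and observable system (n = 2 states, m = 1 input).
A = np.array([[0.9, 0.2], [0.0, 0.7]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.zeros((1, 1))
n, m, L, T = 2, 1, 4, 40

# Offline data w^d = (u^d, y^d) generated with a random input.
u_d = rng.standard_normal((T, m))
y_d = simulate(A, B, C, D, rng.standard_normal(n), u_d)
w_d = np.hstack([u_d, y_d])

# Condition c): the depth-(L+n) Hankel matrix of u^d has full row rank.
pe_rank = np.linalg.matrix_rank(block_hankel(u_d, L + n))

# The data Hankel matrix spans a subspace of dimension m*L + n.
H = block_hankel(w_d, L)
hankel_rank = np.linalg.matrix_rank(H)

# A fresh length-L trajectory should be reproduced as H @ g for some g.
u_new = rng.standard_normal((L, m))
y_new = simulate(A, B, C, D, rng.standard_normal(n), u_new)
w_new = np.hstack([u_new, y_new]).reshape(-1)
g, *_ = np.linalg.lstsq(H, w_new, rcond=None)
residual = np.linalg.norm(H @ g - w_new)
```

With noise-free data, `residual` should sit at the level of floating-point round-off, which is what lets the data matrix stand in for an identified model.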

Figures (3)

  • Figure 1: Flow chart of the proposed problems.
  • Figure 2: Estimation errors of $Q, R$ without noise. For each $T_{ini}$, the simulation is run $15$ times. The blue points show the estimation error of each run. The purple line is the mean error.
  • Figure 3: Estimation errors in the presence of observation noise. The simulation is run $15$ times for each noise variance. The orange points show the error of each run. The solid red curve is the mean error and the dashed line is its variance.

Theorems & Definitions (18)

  • Definition 1
  • Lemma 1
  • Lemma 2
  • Lemma 3
  • Proposition 1
  • Proposition 2
  • Lemma 4: Model-free KKT
  • Definition 2: Scalar Ambiguity
  • Theorem 1: Identifiability
  • proof
  • ...and 8 more