Noise Sensitivity of the Semidefinite Programs for Direct Data-Driven LQR

Xiong Zeng, Laurent Bako, Necmiye Ozay

TL;DR

The paper reveals a fundamental instability in direct data-driven LQR formulations: when data are contaminated by noise, the certainty-equivalence (CE) SDP and its robustness-promoting (RP) variant both yield trivial zero-gain controllers in the large-sample limit, undermining statistical consistency. By reformulating the CE and RP problems and applying the Fundamental Lemma with persistently exciting data, the authors show that the optimal solutions satisfy linear relations that force the feedback gain to vanish with probability one in the presence of noise. Numerical experiments on a second-order unstable system corroborate the theory: CE yields zero gains under noise, and RP with fixed regularization shows similar asymptotic behavior, though increasing the regularization with data length can prevent the collapse. The work underscores the need for truly robust, statistically sound direct data-driven control methods beyond naive CE or fixed-regularization RP formulations, with implications for the design and analysis of data-driven controllers in noisy environments.

Abstract

In this paper, we study the noise sensitivity of the semidefinite program (SDP) proposed for the direct data-driven infinite-horizon linear quadratic regulator (LQR) problem for discrete-time linear time-invariant systems. While this SDP is shown to find the true LQR controller in the noise-free setting, we show that it leads to a trivial solution with zero gain matrices when the data are corrupted by noise, even when the noise is arbitrarily small. We then study a variant of the SDP that includes a robustness-promoting regularization term and prove that regularization does not fully eliminate the sensitivity issue. In particular, the solution of the regularized SDP also converges in probability to a trivial solution.

Paper Structure (16 sections, 12 theorems, 55 equations, 4 figures)

Key Result

Theorem 1

Consider the data matrices $\mathbf{X}_0$, $\mathbf{U}_0$, $\mathbf{X}_1$ defined above and the feedback gain $\mathbf{K}_{ce}$ in \ref{eq:kce}. If $\operatorname{rank}\left(\begin{bmatrix}\mathbf{U}_0\\ \mathbf{X}_0\end{bmatrix}\right)=m+n$ and $\mathbf{w}_t=0$ for all $t$, then $\mathbf{K}_{ce}=\mathbf{K}_{\mathrm{lqr}}$.
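The rank condition and the noise-free recovery claim can be illustrated numerically. The sketch below does not solve the paper's SDP. As a stand-in, it takes the model-based (indirect) route: it checks the rank of the stacked data matrix, identifies $[\mathbf{B}\ \mathbf{A}]$ by least squares, and computes the LQR gain by Riccati iteration. The system matrices, the weights $\mathbf{Q}=\mathbf{I}$ and $\mathbf{R}=\mathbf{I}$, the horizon, the noise level, the sign convention $\mathbf{u}_t=\mathbf{K}\mathbf{x}_t$, and the helper names `simulate` and `lqr_gain` are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def simulate(A, B, T, rng, noise_std=0.0):
    """Roll out x_{t+1} = A x_t + B u_t + w_t with random inputs."""
    n, m = B.shape
    X = np.zeros((n, T + 1))
    U = rng.standard_normal((m, T))
    X[:, 0] = rng.standard_normal(n)
    for t in range(T):
        w = noise_std * rng.standard_normal(n)
        X[:, t + 1] = A @ X[:, t] + B @ U[:, t] + w
    return X[:, :T], U, X[:, 1:]  # X0, U0, X1

def lqr_gain(A, B, Q, R, iters=5000, tol=1e-12):
    """Discrete-time LQR gain via Riccati iteration (assumed convention u = K x)."""
    P = Q.copy()
    for _ in range(iters):
        G = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P_next = Q + A.T @ P @ (A - B @ G)
        done = np.max(np.abs(P_next - P)) < tol
        P = P_next
        if done:
            break
    return -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

def indirect_gain(X0, U0, X1, Q, R, m):
    """Least-squares estimate of [B A], followed by model-based LQR."""
    BA_hat = X1 @ np.linalg.pinv(np.vstack([U0, X0]))
    return lqr_gain(BA_hat[:, m:], BA_hat[:, :m], Q, R)

# Hypothetical second-order unstable system (not the one used in the paper).
rng = np.random.default_rng(0)
A = np.array([[1.1, 0.5], [0.0, 0.9]])
B = np.array([[0.0], [1.0]])
Q, R = np.eye(2), np.eye(1)
n, m = B.shape
T = 30

# Noise-free data: check rank([U0; X0]) = m + n, then recover the gain.
X0, U0, X1 = simulate(A, B, T, rng)
rank = int(np.linalg.matrix_rank(np.vstack([U0, X0])))
K_true = lqr_gain(A, B, Q, R)
K_hat = indirect_gain(X0, U0, X1, Q, R, m)
print("rank:", rank, "gap (noise-free):", np.linalg.norm(K_hat - K_true, 2))

# Slightly noisy data: inspect how far the indirect estimate moves.
X0n, U0n, X1n = simulate(A, B, T, rng, noise_std=0.01)
K_noisy = indirect_gain(X0n, U0n, X1n, Q, R, m)
print("gap (noisy):", np.linalg.norm(K_noisy - K_true, 2))
```

Reproducing the collapse to zero gain described in the paper would require solving the CE or RP SDP with a semidefinite solver, which this sketch deliberately leaves out.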

Figures (4)

  • Figure 1: Model-based (indirect) and direct data-driven control algorithms give the same LQR gain, which equals the true LQR gain when the data $\{ \mathbf{u}_t, \mathbf{x}_t \}_{t=0}^T$ are persistently exciting and come from a noise-free system. The model-based algorithm is continuous (indeed locally Lipschitz continuous) with respect to its inputs, a property also known as algorithmic robustness, so its output degrades gracefully as the input changes (mania2019certainty). In this paper, we show that the direct data-driven LQR algorithm is discontinuous: even with arbitrarily small noise (in almost all directions), the resulting gain reduces to the trivial gain of zero.
  • Figure 2: The $x$-axis is the trajectory length and the $y$-axis is the spectral norm of the feedback gain. The blue curve is the spectral norm of the true LQR gain in \ref{K_lqr_exp}, the orange curve is the spectral norm of the feedback gain $\mathbf{K}_{rp}$ from RP DDD LQR without noise, and the yellow curve is the spectral norm of the estimate of $\mathbf{K}_{rp}$ from RP DDD LQR computed from noisy data with a fixed $T$. The regularization parameter $\eta$ in RP DDD LQR is fixed for all $T$.
  • Figure 3: The $y$-axis is the spectral norm of several quantities evaluated at the optimum. The blue curve is the spectral norm of $\mathbf{U}_0 \mathbf{Y}_{rp}^*$, the orange curve is the spectral norm of $\mathbf{X}_0 \mathbf{Y}_{rp}^*$, and the yellow curve is the spectral norm of $\mathbf{X}_1 \mathbf{Y}_{rp}^*$, where $\mathbf{Y}_{rp}^*$ is an optimal solution of RP DDD LQR in \ref{RPDDDLQRSDP}. The purple curve is the spectral norm of the feedback gain based on \ref{RPDDDLQRSDP} with a fixed $T$.
  • Figure 4: The setup differs from that of Fig. \ref{fig_gain_change_fixed_eta} only in that the regularization parameter $\eta$ in RP DDD LQR increases with the trajectory length $T$.

Theorems & Definitions (22)

  • Definition 1: Convergence in Probability
  • Definition 2: Persistency of Excitation
  • Theorem 1: Theorem 4 in de2019formulas
  • Lemma 1: Lemma 4 in de2021low
  • Theorem 2
  • Remark 1
  • Corollary 1
  • Lemma 2: Fundamental Lemma for Input-State Data, Theorem 1 in van2020willems
  • Lemma 3
  • proof
  • ...and 12 more