Gradient Dominance in the Linear Quadratic Regulator: A Unified Analysis for Continuous-Time and Discrete-Time Systems

Yuto Watanabe, Yang Zheng

Abstract

Despite its nonconvexity, policy optimization for the Linear Quadratic Regulator (LQR) admits a favorable structural property known as gradient dominance, which facilitates linear convergence of policy gradient methods to the globally optimal gain. While gradient dominance has been extensively studied, continuous-time and discrete-time LQRs have largely been analyzed separately, relying on slightly different assumptions, proof strategies, and resulting guarantees. In this paper, we present a unified gradient dominance property for both continuous-time and discrete-time LQRs under mild stabilizability and detectability assumptions. Our analysis is based on a convex reformulation derived from a common Lyapunov inequality representation and a unified change-of-variables procedure. This convex-lifting perspective yields a single proof framework applicable to both time models. The unified treatment clarifies how differences between continuous-time and discrete-time dynamics influence theoretical guarantees and reveals a deeper structural symmetry between the two formulations. Numerical examples illustrate and support the theoretical findings.
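As a rough illustration of the behavior described in the abstract, the following sketch runs plain gradient descent on a scalar discrete-time LQR and compares the result with the gain obtained from the Riccati recursion. It is not taken from the paper: the system parameters `a`, `b`, `q`, `r`, `w`, the initial gain, the step size, and all function names are assumptions chosen only for this example, and the scalar case is used because the cost then has a closed form.

```python
# Scalar discrete-time LQR: x_{t+1} = a*x_t + b*u_t with policy u_t = -k*x_t.
# All numerical values are illustrative assumptions, not taken from the paper.
a, b, q, r, w = 1.2, 1.0, 1.0, 1.0, 1.0  # w plays the role of E[x_0^2]


def cost(k):
    """Closed-form LQR cost J(k); infinite if k is not stabilizing."""
    a_cl = a - b * k
    if abs(a_cl) >= 1.0:
        return float("inf")
    return w * (q + r * k**2) / (1.0 - a_cl**2)


def grad(k):
    """Derivative dJ/dk from the quotient rule (valid for stabilizing k)."""
    a_cl = a - b * k
    num, den = q + r * k**2, 1.0 - a_cl**2
    d_num, d_den = 2.0 * r * k, 2.0 * b * a_cl
    return w * (d_num * den - num * d_den) / den**2


def riccati_gain(iters=200):
    """Optimal gain via fixed-point iteration of the scalar Riccati equation."""
    p = q
    for _ in range(iters):
        p = q + a**2 * p - (a * b * p) ** 2 / (r + b**2 * p)
    return a * b * p / (r + b**2 * p)


k_star = riccati_gain()

# Gradient descent from a stabilizing initial gain (|a - b*k| < 1 at k = 1.5).
k, step = 1.5, 0.02
gaps = []  # optimality gap J(k) - J(k_star) at each iteration
for _ in range(500):
    gaps.append(cost(k) - cost(k_star))
    k -= step * grad(k)
k_gd = k

print(f"Riccati gain: {k_star:.6f}, gradient-descent gain: {k_gd:.6f}")
```

In this toy setting the optimality gap stored in `gaps` shrinks roughly geometrically, which is the kind of linear convergence that gradient dominance is meant to explain. The fixed step size works here only because the iterates stay inside the stabilizing interval; a general implementation would need a step-size rule that guarantees this.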

Paper Structure (27 sections, 16 theorems, 120 equations, 2 figures, 1 table)

Key Result

Theorem 1

Consider the continuous-time LQR \eqref{eq:LQR_continuous-time}. Under \ref{assumption:stabilizable}, the globally optimal input is unique and is given by the stabilizing state feedback policy

Figures (2)

  • Figure 1: Gradient dominance in the discrete-time LQR: (a) Landscape in \ref{example:PL_Q-PSD_AB-stabilizable}, where $W\succ0$ but $Q\nsucc 0$ and $(A,B)$ is only stabilizable; (b) Landscape in \ref{example:PL_WQ-PSD}, where $(A,B)$ is controllable and both $Q$ and $W$ are only positive semidefinite. In both panels, the red dots denote the optimal gain $K^\star$.
  • Figure 2: Non-unique optimal LQR gains when \ref{assumption:X_positive-definite} fails: (a) The continuous-time case in \ref{example:PL_fail_ct}; (b) The discrete-time case in \ref{example:PL_fail_dt}. In both cases, the optimal value of $J$ is attained at every point on the red line, so the optimal LQR gain is not unique.

Theorems & Definitions (29)

  • Theorem 1: Optimal input in continuous time
  • Theorem 2: Optimal input in discrete time
  • Lemma 1
  • Theorem 3: Unified gradient dominance with a non-uniform constant
  • Corollary 1: Uniform constant $\mu$ over a compact set
  • proof
  • Corollary 2: Global gradient dominance for discrete-time systems
  • proof
  • Proposition 1
  • proof
  • ...and 19 more