Ergodic-Risk Constrained Policy Optimization: The Linear Quadratic Case

Shahriar Talebi; Na Li

TL;DR

This work addresses long-horizon risk in stochastic control with heavy-tailed noise by introducing ergodic-risk criteria that capture cumulative uncertainty through $C_\infty$ and its asymptotic variance $\gamma_N^2$. For LTI systems, it derives a quadratic ergodic-risk formulation and a tractable surrogate $\gamma_N^2(K)$, enabling a constrained LQR problem that minimizes the average cost $J(K)=\mathrm{tr}(Q_K \Sigma_K)$ subject to $\gamma_N^2(K)\le\bar{\beta}$. A primal-dual algorithm leveraging strong duality and a fast inner loop computes an optimal stabilizing policy $K^*(\lambda)$ with provable convergence rates. Numerical experiments on a Grumman X-29 model with heavy-tailed noise demonstrate a modest increase in average cost but enhanced resilience to large disturbances, highlighting the method’s practical value for risk-aware control in uncertain environments.
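As a rough illustration of the objective $J(K)=\mathrm{tr}(Q_K \Sigma_K)$, the sketch below evaluates the average cost of a fixed linear policy. It rests on assumptions not stated in the summary above: discrete-time dynamics $x_{t+1} = A x_t + B u_t + w_t$ with $u_t = K x_t$, $Q_K = Q + K^\top R K$, and $\Sigma_K$ taken as the stationary state covariance solving $\Sigma_K = A_K \Sigma_K A_K^\top + W$ with $A_K = A + BK$ and $W$ the noise covariance. The function names are hypothetical, and the risk surrogate $\gamma_N^2(K)$ is not computed here.

```python
import numpy as np


def stationary_covariance(A_K, W):
    """Solve Sigma = A_K Sigma A_K^T + W by vectorization.

    Assumes the closed-loop matrix A_K is Schur stable (spectral radius < 1).
    """
    n = A_K.shape[0]
    if np.max(np.abs(np.linalg.eigvals(A_K))) >= 1.0:
        raise ValueError("closed loop is not Schur stable")
    # vec(A X A^T) = (A kron A) vec(X), so (I - A kron A) vec(Sigma) = vec(W).
    lhs = np.eye(n * n) - np.kron(A_K, A_K)
    sigma = np.linalg.solve(lhs, W.reshape(-1)).reshape(n, n)
    return 0.5 * (sigma + sigma.T)  # symmetrize against round-off


def average_cost(A, B, Q, R, K, W):
    """Average cost J(K) = tr(Q_K Sigma_K) under the assumed convention u = K x."""
    A_K = A + B @ K
    Q_K = Q + K.T @ R @ K
    Sigma_K = stationary_covariance(A_K, W)
    return float(np.trace(Q_K @ Sigma_K))


# Scalar example: A_K = 0.3, Sigma_K = 1 / 0.91, Q_K = 1.04, so J = 1.04 / 0.91 = 8/7.
A = np.array([[0.5]])
B = np.array([[1.0]])
Q = np.array([[1.0]])
R = np.array([[1.0]])
K = np.array([[-0.2]])
W = np.array([[1.0]])
J = average_cost(A, B, Q, R, K, W)
```

The Kronecker-product solve is chosen for clarity and is practical only for small state dimensions; a dedicated Lyapunov solver would be the usual choice at scale.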

Abstract

Risk-sensitive control balances performance with resilience to unlikely events in uncertain systems. This paper introduces ergodic-risk criteria, which capture long-term cumulative risks through probabilistic limit theorems. By ensuring that the dynamics exhibit strong ergodicity, we show that the time-correlated terms in these limiting criteria converge even under potentially heavy-tailed process noise, provided the noise has a finite fourth moment. Building on this, we propose ergodic-risk constrained policy optimization, which incorporates an ergodic-risk constraint into the classical Linear Quadratic Regulation (LQR) framework. We then develop a primal-dual policy optimization method that optimizes average performance while satisfying the ergodic-risk constraint. Numerical results demonstrate that the new risk-constrained LQR not only optimizes average performance but also limits the asymptotic variance associated with the ergodic-risk criterion, making the closed-loop system more robust against sporadic large fluctuations in process noise.
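In the notation of the TL;DR, the constrained problem and a generic Lagrangian relaxation can be written as below. This is a sketch of the standard primal-dual construction, not the paper's exact statement: the step size $\eta$ and the projected dual-ascent update for $\lambda$ are illustrative assumptions.

```latex
\begin{aligned}
&\min_{K \ \text{stabilizing}} \; J(K) = \mathrm{tr}(Q_K \Sigma_K)
\quad \text{s.t.} \quad \gamma_N^2(K) \le \bar{\beta}, \\
&L(K,\lambda) = J(K) + \lambda \bigl(\gamma_N^2(K) - \bar{\beta}\bigr), \qquad \lambda \ge 0, \\
&K^*(\lambda) \in \arg\min_{K} L(K,\lambda), \qquad
\lambda^{+} = \max\bigl\{0,\; \lambda + \eta \bigl(\gamma_N^2(K^*(\lambda)) - \bar{\beta}\bigr)\bigr\}.
\end{aligned}
```

Under this reading, the inner loop computes $K^*(\lambda)$ for a fixed multiplier, and the outer loop raises $\lambda$ while the risk constraint is violated and lowers it toward zero otherwise.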
