Policy Optimization in Robust Control: Weak Convexity and Subgradient Methods

Yuto Watanabe; Feng-Yi Liao; Yang Zheng


TL;DR

This work analyzes discrete-time $\mathcal{H}_\infty$ policy optimization with static output feedback. It proves that the cost $J(K)$ is weakly convex on any convex subset of a sublevel set and that, in the full-state feedback case, it satisfies a weak Polyak-Łojasiewicz inequality, so every stationary point is globally optimal. It develops a simple subgradient method for this nonsmooth, nonconvex problem and establishes the first deterministic non-asymptotic convergence rate, measured via Moreau envelopes, under mild boundedness assumptions. The analysis uses a lower-$C^2$ structure and a complex-SDP representation to show uniform weak convexity on convex subsets of sublevel sets and to connect spectral functions with convex compositions. Numerical experiments support the theory: they show feasibility and convergence for full-state feedback and illustrate landscape complexities (e.g., saddles) in static output feedback, with implications for safe, model-free robust control design.

Abstract

Robust control seeks stabilizing policies that perform reliably under adversarial disturbances, with $\mathcal{H}_\infty$ control as a classical formulation. It is known that policy optimization of robust $\mathcal{H}_\infty$ control naturally leads to nonsmooth and nonconvex problems. This paper builds on recent advances in nonsmooth optimization to analyze discrete-time static output-feedback $\mathcal{H}_\infty$ control. We show that the $\mathcal{H}_\infty$ cost is weakly convex over any convex subset of a sublevel set. This structural property allows us to establish the first non-asymptotic deterministic convergence rate for the subgradient method under suitable assumptions. In addition, we prove a weak Polyak-Łojasiewicz (PL) inequality in the state-feedback case, implying that all stationary points are globally optimal. Finally, we present a few numerical examples to validate the theoretical results.
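
For readers less familiar with the terminology, the notions named in the abstract can be sketched using their standard definitions from nonsmooth optimization. The symbols $\rho$, $\lambda$, $\alpha_t$, and $G_t$ below are illustrative assumptions, not notation confirmed by the paper, and the paper's exact constants, domains, and step-size rules may differ. A function $J$ is $\rho$-weakly convex on a convex set $\mathcal{C}$ if adding a quadratic makes it convex:

$$
K \mapsto J(K) + \frac{\rho}{2}\,\|K\|_F^2 \quad \text{is convex on } \mathcal{C}.
$$

For $0 < \lambda < 1/\rho$, the Moreau envelope and the associated proximal point are

$$
J_\lambda(K) = \min_{Y}\Big\{ J(Y) + \frac{1}{2\lambda}\,\|Y - K\|_F^2 \Big\},
\qquad
\hat{K} = \operatorname*{arg\,min}_{Y}\Big\{ J(Y) + \frac{1}{2\lambda}\,\|Y - K\|_F^2 \Big\},
$$

and a small envelope gradient certifies near-stationarity, since

$$
\|\hat{K} - K\|_F = \lambda\,\|\nabla J_\lambda(K)\|_F,
\qquad
\operatorname{dist}\big(0, \partial J(\hat{K})\big) \le \|\nabla J_\lambda(K)\|_F .
$$

A generic subgradient iteration of the kind the abstract refers to takes the form

$$
K_{t+1} = K_t - \alpha_t\, G_t, \qquad G_t \in \partial J(K_t),
$$

so a non-asymptotic rate in this setting is typically a bound on $\min_{t} \|\nabla J_\lambda(K_t)\|_F$ after a finite number of iterations.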


