Remarks on the Polyak-Lojasiewicz inequality and the convergence of gradient systems
Arthur Castello B. de Oliveira, Leilei Cui, Eduardo D. Sontag
TL;DR
The paper develops a framework for generalizing the Polyak-Łojasiewicz inequality (PLI) using nonlinear comparison functions and analyzes how these generalizations shape gradient-flow convergence. It then applies the framework to continuous-time LQR (CT-LQR) policy optimization, proving that CT-LQR cannot satisfy a global PLI and that it exhibits region-dependent convergence: bounded along high-gain trajectories yet unbounded near the stability boundary. A scalar LQR study clarifies the two convergence regimes, namely exponential-like convergence near the optimum and explosive sensitivity near the stability boundary. The results illuminate how weaker PL conditions govern convergence profiles and motivate future work on proximal-gradient methods with $L_1$ regularization. Overall, the work clarifies the limitations of global PL guarantees for CT-LQR and provides a nuanced view of gradient-flow dynamics under generalized PL inequalities.
Abstract
This work explores generalizations of the Polyak-Łojasiewicz inequality (PLI) and their implications for the convergence behavior of gradient flows in optimization problems. Motivated by the continuous-time linear quadratic regulator (CT-LQR) policy optimization problem -- where only a weaker version of the PLI is characterized in the literature -- this work shows that while weaker conditions are sufficient for global convergence to, and optimality of, the set of critical points of the cost function, the "profile" of the gradient flow solution can change significantly depending on which "flavor" of inequality the cost satisfies. After a general theoretical analysis, we focus on fitting the CT-LQR policy optimization problem to the proposed framework, showing that, in fact, it can never satisfy a PLI in its strongest form. We follow up our analysis with a brief discussion of the difference between continuous- and discrete-time LQR policy optimization, and end the paper with some intuition on extending this framework to optimization problems with $L_1$ regularization that are solved through proximal gradient flows.
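To make the claim about the strongest form of the PLI concrete, the following Python sketch evaluates a scalar CT-LQR cost and its gradient in closed form. Everything specific in it is an assumption made for illustration and is not taken from the paper: the plant $\dot{x} = a x + b u$ with feedback $u = -k x$, the values $a = b = q = r = 1$, the unit initial state, and the cost normalization $J(k) = (q + r k^2) x_0^2 / (2(bk - a))$. Under these assumptions the ratio $J'(k)^2 / (2(J(k) - J^*))$, which a global PLI would require to stay above some constant $\mu > 0$, decays toward zero as the gain grows, while the gradient blows up as the gain approaches the stability boundary $bk = a$.

```python
import math

# Hypothetical scalar plant dx/dt = A*x + B*u with static feedback u = -k*x.
# The numeric values below are illustrative assumptions, not from the paper.
A, B, Q, R = 1.0, 1.0, 1.0, 1.0


def cost(k, x0=1.0):
    """Integral of q*x^2 + r*u^2 along the closed loop; finite only if B*k > A."""
    margin = B * k - A
    if margin <= 0:
        return math.inf  # non-stabilizing gain: the cost diverges
    return (Q + R * k * k) * x0 * x0 / (2.0 * margin)


def grad(k, x0=1.0):
    """Derivative of cost(k) with respect to k, valid for stabilizing gains."""
    margin = B * k - A
    if margin <= 0:
        raise ValueError("gain is not stabilizing")
    numerator = R * B * k * k - 2.0 * A * R * k - B * Q
    return numerator * x0 * x0 / (2.0 * margin ** 2)


# Optimal gain from the scalar algebraic Riccati equation.
K_STAR = (A + math.sqrt(A * A + B * B * Q / R)) / B
J_STAR = cost(K_STAR)


def pl_ratio(k):
    """grad^2 / (2*(J - J*)); undefined at k = K_STAR, where it is 0/0."""
    return grad(k) ** 2 / (2.0 * (cost(k) - J_STAR))


if __name__ == "__main__":
    for k in (1.001, 1.5, K_STAR + 0.5, 10.0, 1e3, 1e6):
        print(f"k={k:>12.4f}  J={cost(k):.4e}  dJ/dk={grad(k):+.4e}  "
              f"PL ratio={pl_ratio(k):.4e}")
```

Running the script prints the cost, gradient, and ratio over a sweep of gains from just inside the stability boundary to very large values. The sweep is only a numerical illustration of the two regimes under the stated assumptions, not a substitute for the general analysis.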
