Dual Averaging With Non-Strongly-Convex Prox-Functions: New Analysis and Algorithm
Renbo Zhao
TL;DR
The paper broadens the applicability of dual averaging (DA) to convex composite problems P(x) = f(Ax) + h(x) where h may be non-strongly convex. It develops two complementary analytical frameworks: (i) conditions under which the original DA maintains an O(1/k) primal-dual convergence rate, and (ii) a new DA-type method with dual monotonicity (MDA) that achieves the same rate under weaker assumptions, including open-domain h^*. It further introduces a Pasch-Hausdorff envelope-based extension F_L of f to handle cases where f is only convex and Lipschitz on the range C = A(dom h), enabling global Lipschitzness and tractable dual analysis, plus affine-invariance properties and practical certificates for the assumptions. The work links to Frank-Wolfe methods by viewing DA as a dual FW scheme under suitable conditions and extends FW-type analysis to a broader class of prox-functions, thereby enabling efficient first-order optimization for a wider array of non-strongly-convex prox-functions with provable rates.
Abstract
We present a new analysis of, and a new algorithm for, dual-averaging-type (DA-type) methods for solving the composite convex optimization problem $\min_{x\in\mathbb{R}^n} \, f(\mathsf{A} x) + h(x)$, where $f$ is a convex and globally Lipschitz function, $\mathsf{A}$ is a linear operator, and $h$ is a ``simple'' and convex function that is used as the prox-function in the DA-type methods. We open new avenues for analyzing and developing DA-type methods by going beyond the canonical setting where the prox-function $h$ is assumed to be strongly convex (on its domain). To that end, we identify two new sets of assumptions on $h$ (and also $f$ and $\mathsf{A}$) and show that they hold broadly for many important classes of non-strongly-convex functions. Under the first set of assumptions, we show that the original DA method still has an $O(1/k)$ primal-dual convergence rate. Moreover, we analyze the affine invariance of this method and its convergence rate. Under the second set of assumptions, we develop a new DA-type method with dual monotonicity, and show that it has an $O(1/k)$ primal-dual convergence rate. Finally, we consider the case where $f$ is only convex and Lipschitz on $\mathcal{C}:=\mathsf{A}(\mathsf{dom}\, h)$, and construct its globally convex and Lipschitz extension based on the Pasch-Hausdorff envelope. Furthermore, we characterize the sub-differential and Fenchel conjugate of this extension using the convex analytic objects associated with $f$ and $\mathcal{C}$.
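
For orientation, the Pasch-Hausdorff envelope mentioned in the abstract is commonly defined as below. This is a sketch of the standard construction, not necessarily the paper's exact formulation; the Lipschitz constant $L$, the norm $\|\cdot\|$, and its dual norm $\|\cdot\|_*$ are assumptions not fixed by the abstract. If $f$ is convex and $L$-Lipschitz on the convex set $\mathcal{C}$ with respect to $\|\cdot\|$, then

$$
F_L(u) \;:=\; \inf_{v\in\mathcal{C}} \big\{ f(v) + L\,\|u - v\| \big\}, \qquad u \in \mathbb{R}^m,
$$

is convex, globally $L$-Lipschitz, and satisfies $F_L = f$ on $\mathcal{C}$. Since $F_L$ is the infimal convolution of $f + \iota_{\mathcal{C}}$ with $L\|\cdot\|$, the standard conjugate rule suggests

$$
F_L^*(y) \;=\; (f + \iota_{\mathcal{C}})^*(y) + \iota_{\{\|y\|_* \le L\}}(y),
$$

where $\iota_S$ denotes the indicator function of a set $S$. As a one-dimensional illustration, taking $f(v) = v^2$, $\mathcal{C} = [-1,1]$ and $L = 2$ gives $F_L(u) = u^2$ for $|u| \le 1$ and $F_L(u) = 2|u| - 1$ otherwise.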
