
Dual Averaging With Non-Strongly-Convex Prox-Functions: New Analysis and Algorithm

Renbo Zhao

TL;DR

The paper extends dual averaging (DA) to convex composite problems P(x) = f(Ax) + h(x) in which the prox-function h need not be strongly convex. It develops two complementary analytical frameworks: (i) conditions under which the original DA method retains an O(1/k) primal-dual convergence rate, and (ii) a new DA-type method with dual monotonicity (MDA) that achieves the same rate under weaker assumptions, including the case where h^* has an open domain. It further introduces an extension F_L of f, based on the Pasch-Hausdorff envelope, to handle the case where f is only convex and Lipschitz on the range C = A(dom h); the extension is globally Lipschitz and admits a tractable dual analysis. The paper also establishes affine-invariance properties and gives practical ways to verify the assumptions. Finally, it links DA to Frank-Wolfe (FW) methods by viewing DA as a dual FW scheme under suitable conditions, extending FW-type analysis to a broader class of non-strongly-convex prox-functions with provable rates.

Abstract

We present a new analysis of, and a new algorithm for, dual-averaging-type (DA-type) methods for solving the composite convex optimization problem ${\min}_{x\in\mathbb{R}^n} \, f(\mathsf{A} x) + h(x)$, where $f$ is a convex and globally Lipschitz function, $\mathsf{A}$ is a linear operator, and $h$ is a ``simple'' and convex function that is used as the prox-function in the DA-type methods. We open new avenues for analyzing and developing DA-type methods by going beyond the canonical setting where the prox-function $h$ is assumed to be strongly convex (on its domain). To that end, we identify two new sets of assumptions on $h$ (and also $f$ and $\mathsf{A}$) and show that they hold broadly for many important classes of non-strongly-convex functions. Under the first set of assumptions, we show that the original DA method still has an $O(1/k)$ primal-dual convergence rate. Moreover, we analyze the affine invariance of this method and its convergence rate. Under the second set of assumptions, we develop a new DA-type method with dual monotonicity, and show that it has an $O(1/k)$ primal-dual convergence rate. Finally, we consider the case where $f$ is only convex and Lipschitz on $\mathcal{C}:=\mathsf{A}(\mathsf{dom}\, h)$, and construct its globally convex and Lipschitz extension based on the Pasch-Hausdorff envelope. Furthermore, we characterize the sub-differential and Fenchel conjugate of this extension using the convex analytic objects associated with $f$ and $\mathcal{C}$.
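As a rough illustration of the Pasch-Hausdorff envelope mentioned above, the sketch below uses the standard definition $F_L(y) = \inf_{z\in\mathcal{C}} \{ f(z) + L\,|y - z| \}$ in one dimension. The paper's exact construction and notation may differ. The function `pasch_hausdorff_envelope`, the choice $f(z) = z^2$, the set $\mathcal{C} = [-1, 1]$, the constant $L = 2$, and the grid approximation of $\mathcal{C}$ are all assumptions made for this example only.

```python
import numpy as np

def pasch_hausdorff_envelope(f, C_grid, L, y):
    """Approximate F_L(y) = min_{z in C} f(z) + L*|y - z| over a finite grid of C.

    Hypothetical helper for illustration; C is replaced by the points in C_grid.
    """
    z = np.asarray(C_grid, dtype=float)
    return float(np.min(f(z) + L * np.abs(y - z)))

# Assumed toy data: f(z) = z^2 is convex and 2-Lipschitz on C = [-1, 1].
f = lambda z: z ** 2
C_grid = np.linspace(-1.0, 1.0, 2001)  # includes both endpoints
L = 2.0

# Inside C the envelope agrees with f, because f is L-Lipschitz on C.
inside = pasch_hausdorff_envelope(f, C_grid, L, 0.5)

# Outside C the envelope grows linearly with slope L from the nearest boundary point:
# F_L(3) = f(1) + L * (3 - 1).
outside = pasch_hausdorff_envelope(f, C_grid, L, 3.0)

print(inside, outside)
```

In this toy case the extension coincides with $f$ on $\mathcal{C}$ and is $L$-Lipschitz everywhere, which is the property the abstract relies on when $f$ is only Lipschitz on $\mathcal{C}$.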

Paper Structure

This paper contains 28 sections, 36 theorems, 141 equations, 2 algorithms.

Key Result

Lemma 2.1

Under Assumption \ref{assum:h}, $\mathsf{int}\,\mathsf{dom}\, h^*\ne \emptyset$ and $h^*$ is continuously differentiable on $\mathsf{int}\,\mathsf{dom}\, h^*$. In addition, if $\mathsf{dom}\, h$ is non-singleton, then $h$ is strictly convex on $\mathsf{dom}\, h$.

Theorems & Definitions (92)

  • Remark 2.1: Effects of $\Vert\cdot\Vert_\mathbb{X}$ on $\mu_\mathcal{S}$
  • Lemma 2.1
  • proof
  • Remark 2.2: Verifying Assumption \ref{assum:Q}
  • Remark 2.3: Well-Definedness of Algorithm \ref{algo:DA}
  • Lemma 2.2
  • proof
  • Lemma 2.3
  • proof
  • Lemma 2.4
  • ...and 82 more