The proximal point method revisited

Dmitriy Drusvyatskiy

TL;DR

This paper surveys how the proximal point method remains an organizing principle for large-scale optimization, following three concrete directions: proximally guided subgradient methods for weakly convex stochastic problems, the prox-linear algorithm for structured composite objectives, and Catalyst-style acceleration for regularized ERM. It explains how proximal subproblems can be made well-conditioned, how local models lead to efficient subproblem solves, and how inertial acceleration can reduce overall complexity in finite-sum settings. The survey covers both theoretical guarantees (convergence rates, and rapid local convergence under regularity conditions) and practical considerations (warm-starting, subproblem solvers, and variance reduction). The resulting proximal-point-inspired algorithms come with convergence guarantees in nonconvex and weakly convex settings, with applications in machine learning, signal processing, and related fields.

Abstract

In this short survey, I revisit the role of the proximal point method in large-scale optimization. I focus on three recent examples: a proximally guided subgradient method for weakly convex stochastic approximation, the prox-linear algorithm for minimizing compositions of convex functions and smooth maps, and Catalyst generic acceleration for regularized Empirical Risk Minimization.

Paper Structure

This paper contains 8 sections, 35 equations, 2 algorithms.

Theorems & Definitions (9)

  • Definition 2.1
  • Example 2.1: Additive composite
  • Example 2.2: Nonlinear least squares
  • Example 2.3: Exact penalty formulations
  • Example 2.4: Robust phase retrieval
  • Example 2.5: Robust PCA
  • Example 2.6: Censored $\mathbb{Z}_2$ synchronization
  • Definition 4.1: tilt
  • Definition 4.2: weak_sharp