Improving Convergence Guarantees of Random Subspace Second-order Algorithm for Nonconvex Optimization

Rei Higuchi; Pierre-Louis Poirion; Akiko Takeda

TL;DR

The Random Subspace Homogenized Trust Region (RSHTR) method is proposed. It reaches an $\varepsilon$-approximate first-order stationary point in $O(\varepsilon^{-3/2})$ iterations and converges locally at a linear rate; under rank-deficient conditions, it also exhibits local quadratic convergence.

Abstract

In recent years, random subspace methods have been actively studied for high-dimensional nonconvex problems. Recent subspace methods have improved theoretical guarantees such as iteration complexity and local convergence rate while reducing computational costs by deriving descent directions in randomly selected low-dimensional subspaces. This paper proposes the Random Subspace Homogenized Trust Region (RSHTR) method, which has the best theoretical guarantees among random subspace algorithms for nonconvex optimization. RSHTR reaches an $\varepsilon$-approximate first-order stationary point in $O(\varepsilon^{-3/2})$ iterations and converges locally at a linear rate. Furthermore, under rank-deficient conditions, RSHTR satisfies $\varepsilon$-approximate second-order necessary conditions in $O(\varepsilon^{-3/2})$ iterations and exhibits local quadratic convergence. Experiments on real-world datasets verify the benefits of RSHTR.
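The idea of "deriving descent directions in randomly selected low-dimensional subspaces" can be illustrated with a minimal sketch. This is not RSHTR itself: it is a generic regularized Newton-type step restricted to a random subspace, applied to a convex quadratic so that the decrease is easy to check. The function names, the Gaussian sketch matrix `P` with its `1/sqrt(s)` scaling, the regularization `mu`, and the test problem are all assumptions made for illustration.

```python
import numpy as np

def random_subspace_step(grad, hess, x, s, rng, mu=1e-3):
    """One generic random-subspace Newton-type step (illustrative, not RSHTR)."""
    n = x.size
    # Gaussian sketch matrix; the 1/sqrt(s) scaling is an assumption of this sketch.
    P = rng.standard_normal((s, n)) / np.sqrt(s)
    g_sub = P @ grad(x)          # reduced gradient in R^s
    H_sub = P @ hess(x) @ P.T    # reduced Hessian, an s-by-s matrix
    # Solve the regularized Newton system in the small subspace only.
    u = np.linalg.solve(H_sub + mu * np.eye(s), -g_sub)
    return x + P.T @ u           # lift the subspace step back to R^n

def demo(n=50, s=5, iters=200, seed=0):
    """Run the step on a hypothetical convex quadratic test problem."""
    rng = np.random.default_rng(seed)
    M = rng.standard_normal((n, n))
    A = M @ M.T / n + np.eye(n)  # symmetric positive definite matrix
    b = rng.standard_normal(n)
    f = lambda x: 0.5 * x @ A @ x - b @ x
    grad = lambda x: A @ x - b
    hess = lambda x: A
    x = np.zeros(n)
    values = [f(x)]
    for _ in range(iters):
        x = random_subspace_step(grad, hess, x, s, rng)
        values.append(f(x))
    return values, np.linalg.norm(grad(x))

if __name__ == "__main__":
    values, grad_norm = demo()
    print(f"f(x_0) = {values[0]:.4f}, f(x_end) = {values[-1]:.4f}, "
          f"final gradient norm = {grad_norm:.2e}")
```

Each iteration factorizes only an $s \times s$ system rather than an $n \times n$ one, which is where the computational saving of subspace methods comes from. On this convex quadratic the objective cannot increase from one iterate to the next; the nonconvex setting treated in the paper needs the additional safeguards that RSHTR provides.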

Paper Structure (55 sections, 35 theorems, 174 equations, 15 figures, 2 tables, 6 algorithms)

Introduction
Overview of existing random subspace methods:
Our research idea:
Contribution:
Existing random subspace algorithms for nonconvex optimization
Notations.
Proposed method
Existing algorithm: HSODM
Random Subspace Homogenized Trust Region: RSHTR
Total computational complexity and space complexity
Theoretical analysis
Global convergence to an $\varepsilon$--FOSP
Global convergence to an $\varepsilon$--SOSP
Local linear convergence
Local convergence for strongly convex $f$ in its effective subspace
...and 40 more sections

Key Result

Lemma 3.1

Suppose that Assumption asmp:lips holds. If $\|d_{k}\| > \Delta$, then for all $\delta>0$ with probability at least $1 - 2 \exp(-s)$. Here $\mathcal{C}$ is an absolute constant (see Wainwright_2019 for more details).

Figures (15)

Figure 1: Illustration of our random subspace method on $\mathbb{R}^2$. Each iteration restricts the update to a $1$-dim. randomly selected subspace.
Figure 2: Log plot of the convergence of RSHTR on low effective Rosenbrock problems. The subspace dimension $s$ is fixed at 100, and the problem rank $r$ is varied ($r = 25, 50, 100, 150$).
Figure 3: The impact of the choice of subspace dimension $s ~ (=50, 100, 200)$ on convergence in random subspace algorithms (RSGD, RSRN, RSHTR) for MF.
Figure 4: LER (dim: 10,000)
Figure 5: MF: MovieLens (dim: 131,250)
...and 10 more figures

Theorems & Definitions (62)

Lemma 3.1
Lemma 3.2
Lemma 3.3
Theorem 3.1: Global convergence to an $\varepsilon$--FOSP
Corollary 3.1
Lemma 3.4
Theorem 3.2: Global convergence to an $\varepsilon$--SOSP under rank deficiency
Theorem 3.3: Local linear convergence
Theorem 3.4: Local quadratic convergence under $\rho$--strong convexity in effective subspace
Lemma B.1
...and 52 more
