A randomized algorithm for nonconvex minimization with inexact evaluations and complexity guarantees

Shuyao Li; Stephen J. Wright

TL;DR

This work considers minimizing a smooth nonconvex function with inexact oracle access to the gradient and Hessian, with the goal of reaching approximate second-order optimality. Applied to empirical risk minimization problems, the algorithm obtains improved gradient sample complexity over existing works.

Abstract

We consider minimization of a smooth nonconvex function with inexact oracle access to gradient and Hessian (without assuming access to the function value) to achieve approximate second-order optimality. A novel feature of our method is that if an approximate direction of negative curvature is chosen as the step, we choose its sense to be positive or negative with equal probability. We allow gradients to be inexact in a relative sense and relax the coupling between inexactness thresholds for the first- and second-order optimality conditions. Our convergence analysis includes both an expectation bound based on martingale analysis and a high-probability bound based on concentration inequalities. We apply our algorithm to empirical risk minimization problems and obtain improved gradient sample complexity over existing works.
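The randomized choice of sign described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the function name `randomized_step`, the fixed step lengths `step_g` and `step_h`, the exact eigendecomposition, and the simple absolute thresholds are all assumptions made for illustration, and the paper's actual step sizes and relative inexactness conditions are not reproduced here. Note that the sketch never evaluates the function value, only gradient and Hessian oracles.

```python
import numpy as np

def randomized_step(x, grad, hess, eps_g, eps_h, step_g, step_h, rng):
    """One illustrative iteration (hypothetical helper, not the paper's method).

    grad, hess: callables returning (possibly inexact) gradient and Hessian.
    step_g, step_h: assumed fixed step lengths, chosen only for this sketch.
    Returns (new_x, kind) with kind in {"gradient", "negative_curvature", "stop"}.
    """
    g = grad(x)
    if np.linalg.norm(g) > eps_g:
        # Gradient is large: take a plain gradient step.
        return x - step_g * g, "gradient"

    # Gradient is small: look for a direction of negative curvature.
    eigvals, eigvecs = np.linalg.eigh(hess(x))  # eigenvalues in ascending order
    lam_min, v = eigvals[0], eigvecs[:, 0]
    if lam_min < -eps_h:
        # Choose the sense of the step to be +v or -v with equal probability.
        sign = rng.choice([-1.0, 1.0])
        return x + sign * step_h * v, "negative_curvature"

    # Both conditions hold approximately: stop.
    return x, "stop"

# Usage on the saddle f(x) = x_0^2 - x_1^2 (assumed test function).
grad = lambda x: np.array([2.0 * x[0], -2.0 * x[1]])
hess = lambda x: np.diag([2.0, -2.0])
rng = np.random.default_rng(0)
x_new, kind = randomized_step(np.zeros(2), grad, hess, 1e-3, 1e-3, 0.1, 0.5, rng)
```

At the saddle point the gradient vanishes, so the sketch moves along the eigenvector of the most negative eigenvalue, with the sign drawn at random because no function value is available to tell which sense gives descent.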

Paper Structure (15 sections, 10 theorems, 60 equations, 1 algorithm)

Introduction
Prior Work
Approximate Second-Order Points.
Inexact derivatives.
Stochastic settings (including finite-sum problems).
Inexact settings beyond (sub)sampling.
Notation
A randomized algorithm with expected descent
Algorithm and Assumptions
Bound on expected stopping time
High-probability bound on stopping time
Computation of Inexact Eigenvectors and Eigenvalues
Comparison with previous results
Sampling
Conclusion

Key Result

Proposition 2.5

If Algorithm 1 terminates and returns $x_{n}$, then $x_{n}$ is an $(\frac{4}{3}\epsilon_g, \frac{4}{3}\epsilon_{H})$-approximate second-order stationary point.
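For orientation, the conclusion of the proposition can be written out under the usual definition of an approximate second-order stationary point. The definition below is the standard one and is assumed, not confirmed from this listing, to match the paper's; $f$ denotes the objective and $\lambda_{\min}$ the smallest eigenvalue.

```latex
% Assumed standard definition: x is an (\epsilon_g, \epsilon_H)-approximate
% second-order stationary point of f if
\|\nabla f(x)\| \le \epsilon_g
\quad\text{and}\quad
\lambda_{\min}\bigl(\nabla^2 f(x)\bigr) \ge -\epsilon_H .
% Under that definition, the proposition states that the returned point satisfies
\|\nabla f(x_n)\| \le \tfrac{4}{3}\,\epsilon_g
\quad\text{and}\quad
\lambda_{\min}\bigl(\nabla^2 f(x_n)\bigr) \ge -\tfrac{4}{3}\,\epsilon_H .
```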

Theorems & Definitions (29)

Definition 1.1: Lipschitz continuity
Proposition 2.5
proof
Theorem 2.6
proof: Proof of Theorem 2.6
Remark 2.10
Theorem 2.11
Corollary 2.12: Short-Step Negative Curvature Updates
proof
Corollary 2.13
...and 19 more
