First- and Second-Order Stochastic Adaptive Regularization with Cubics: High Probability Iteration and Sample Complexity

Katya Scheinberg; Miaolan Xie

TL;DR

This work tackles unconstrained nonconvex optimization where function values and derivatives are accessed through stochastic oracles. It develops two stochastic Adaptive Regularization with Cubics (SARC) algorithms, one first-order and one second-order, and proves that they achieve the deterministic $O(\varepsilon^{-3/2})$ iteration rate with high probability, while handling biased and arbitrary zeroth-order errors via an error-corrected acceptance rule. The analysis introduces true-iteration concepts, stochastic process framing, and tailored oracle-accuracy settings, establishing both high-probability and in-expectation iteration bounds, plus novel high-probability and expectation-based sample complexity results. The results show that SARC variants retain optimal iteration complexity and provide explicit, problem-dependent minibatch-based sample complexities for expectation minimization, highlighting their theoretical advantage over other stochastic adaptive methods.

Abstract

We present high-probability (and expectation) complexity bounds for two versions of stochastic adaptive regularization methods with cubics (SARC), also known as regularized Newton methods. The first algorithm aims to find first-order stationary points, while the second targets second-order optimality conditions. Both methods employ stochastic zeroth-, first-, and second-order oracles with specific accuracy and reliability requirements. These oracles, which have previously been used with other stochastic adaptive methods such as trust-region and line-search algorithms, are applicable to various optimization settings, including expected risk minimization and simulation optimization. In this paper, we establish the first high-probability iteration and sample complexity bounds for both first- and second-order SARC algorithms. Our analysis demonstrates that, as in the deterministic case, they outperform other stochastic adaptive methods.
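
To make the error-corrected acceptance rule from the TL;DR concrete, here is a minimal sketch of the ratio test used in cubic-regularization methods. The cubic model is the standard one for adaptive regularization with cubics. The function names, the default `eta`, and the exact form of the correction (adding `2 * eps_f` to the observed decrease) are assumptions made for illustration and are not taken from this page.

```python
import numpy as np


def cubic_model_change(g, H, sigma, s):
    """Return m(x + s) - m(x) for the standard cubic-regularized model.

    g and H are (possibly stochastic) gradient and Hessian estimates,
    sigma > 0 is the regularization weight, and s is the trial step.
    """
    return g @ s + 0.5 * (s @ H @ s) + (sigma / 3.0) * np.linalg.norm(s) ** 3


def accept_step(f_old, f_new, model_decrease, eta=0.1, eps_f=0.0):
    """Error-corrected ratio test (illustrative assumption).

    f_old and f_new are noisy function estimates at x and x + s.
    model_decrease is m(x) - m(x + s) and must be positive.
    The observed decrease is inflated by 2 * eps_f so that a zeroth-order
    error of size up to eps_f in each estimate cannot, by itself, cause
    a step with sufficient true decrease to be rejected.
    """
    rho = (f_old - f_new + 2.0 * eps_f) / model_decrease
    return rho >= eta
```

With `eps_f = 0` this reduces to the usual deterministic ratio test; a positive `eps_f` lets the test tolerate function-value noise of that magnitude.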

Paper Structure (9 sections, 19 theorems, 80 equations)

Introduction
First-Order SARC
Deterministic Properties of Algorithm 1
Stochastic Properties of Algorithm 1
High-Probability and Expected First-Order Iteration Complexity
Second-Order SARC
High-Probability and Expected Second-Order Iteration Complexity
Sample Complexity
High Probability First-Order and Second-Order Sample Complexity Result

Key Result

Lemma 1

Consider any realization of Algorithm 1. For each iteration $k$, we have [...]. On every successful iteration $k$, we have [...], which implies [...].

Theorems & Definitions (41)

Definition 1: True iteration
Remark 1
Lemma 1: Improvement on successful iterations
proof
Lemma 2: Large $\sigma_k$ guarantees success or small step
proof
Lemma 3: Lower bound on step norm in terms of $\|\nabla \phi(x_k^+)\|$
proof
Lemma 4: Lower bound on step norm until $\epsilon$-accuracy is reached
proof
...and 31 more
