Inexact and Implementable Accelerated Newton Proximal Extragradient Method for Convex Optimization

Ziyu Huang; Bo Jiang; Yuntian Jiang

TL;DR

This paper introduces the inexact A-NPE method (IA-NPE), shows that it maintains the near-optimal oracle complexity of the exact method, and establishes that the line-search procedure, a subroutine of IA-NPE, is robust to inexactness in the Hessian.

Abstract

In this paper, we investigate the convergence behavior of the Accelerated Newton Proximal Extragradient (A-NPE) method when employing inexact Hessian information. The exact A-NPE method was the pioneering near-optimal second-order approach, with an oracle complexity of $\tilde{O}(\epsilon^{-2/7})$ for convex optimization. Despite its theoretical optimality, its inexact version and its efficient implementation have received little attention. We introduce the inexact A-NPE method (IA-NPE), which is shown to maintain the near-optimal oracle complexity. In particular, we design a dynamic approach that balances the computational cost of constructing the Hessian matrix against the progress of the convergence. Moreover, we show that the line-search procedure, a subroutine of IA-NPE, is robust to inexactness in the Hessian. These properties make it possible to use effective machine learning techniques such as sub-sampling, along with various heuristics, within the method. Extensive numerical results illustrate that IA-NPE compares favorably with state-of-the-art second-order methods, including Newton's method with cubic regularization and trust-region methods.
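To make the idea of an inexact Hessian obtained by sub-sampling concrete, here is a minimal sketch. It is not the authors' subroutine: the logistic-loss objective, the uniform sampling without replacement, and the helper names `logistic_hessian` and `subsampled_hessian` are all assumptions chosen for illustration.

```python
import numpy as np


def logistic_hessian(A, x, idx=None):
    """Hessian of the average logistic loss (1/n) * sum_i log(1 + exp(-b_i * a_i^T x)).

    The labels b_i in {-1, +1} drop out because b_i^2 = 1 and s(t) * (1 - s(t))
    is an even function of t, so only the feature rows a_i are needed.
    If idx is given, only those rows are used (hypothetical sub-sampling helper).
    """
    rows = A if idx is None else A[idx]
    z = rows @ x
    p = 1.0 / (1.0 + np.exp(-z))          # sigmoid of a_i^T x
    w = p * (1.0 - p)                     # per-sample curvature weights
    return (rows * w[:, None]).T @ rows / rows.shape[0]


def subsampled_hessian(A, x, sample_size, rng):
    """Inexact Hessian built from a uniform sample of rows (an illustrative assumption)."""
    idx = rng.choice(A.shape[0], size=sample_size, replace=False)
    return logistic_hessian(A, x, idx)


rng = np.random.default_rng(0)
n, d = 2000, 5
A = rng.standard_normal((n, d))
x = rng.standard_normal(d)

H_full = logistic_hessian(A, x)
H_small = subsampled_hessian(A, x, 50, rng)   # cheap, noisy approximation
H_all = subsampled_hessian(A, x, n, rng)      # sampling every row recovers H_full

# Spectral-norm error of the cheap approximation.
err_small = np.linalg.norm(H_small - H_full, 2)
print(f"spectral error with 50 samples: {err_small:.4f}")
```

The sample size is the knob that trades Hessian construction cost against accuracy. How that size is chosen from iteration to iteration is what the paper's dynamic approach addresses, and it is not reproduced here.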

Paper Structure (19 sections, 21 theorems, 116 equations, 3 figures, 1 table, 2 algorithms)

This paper contains 19 sections, 21 theorems, 116 equations, 3 figures, 1 table, 2 algorithms.

Introduction
Preliminaries
Overview of the IA-NPE Method
The IA-NPE Method and the Approximate Solution to the Subproblem
Alternative Representation of Approximate Newton Solution
Complexity of the Line-search Procedure
Preliminary Results
Analysis of The Bracketing Points
Complexity of the Bisection Stage
Complexity Analysis of the IA-NPE Method
Convergence of the Main Loop in the Main Algorithm
Total Complexity of the Algorithm
Subroutines for Approximating the Hessian
Subspace Approximation
Sub-Sampling Approximation
...and 4 more sections

Key Result

Proposition 3.1

Let $(\lambda,x) \in \mathbb{R}_{++} \times \mathbb{R}^d$ and a $(\hat{\sigma},\delta)$-approximate Newton solution $(y,u,\epsilon)$ at $(\lambda,x)$ be given, and define $v:=\mathcal{G}(y)+u-\mathcal{G}_{x,\delta}(y)$. Then, and

Figures (3)

Figure 1: Log-scale norm of gradient vs. time.
Figure 2: Log-scale norm of gradient vs. number of epochs.
Figure 3: Log-scale norm of gradient vs. number of epochs.

Theorems & Definitions (31)

Definition 2.1
Definition 2.2
Definition 2.3
Definition 3.1
Definition 3.2
Remark 3.1
Definition 3.3
Definition 3.4
Definition 3.5
Proposition 3.1
...and 21 more
