Inexact and Implementable Accelerated Newton Proximal Extragradient Method for Convex Optimization

Ziyu Huang, Bo Jiang, Yuntian Jiang

TL;DR

This paper introduces the inexact A-NPE method (IA-NPE), shows that it retains the near-optimal oracle complexity of the exact method, and establishes that its line-search subroutine is robust to inexactness in the Hessian.

Abstract

In this paper, we investigate the convergence behavior of the Accelerated Newton Proximal Extragradient (A-NPE) method when employing inexact Hessian information. The exact A-NPE method was the pioneering near-optimal second-order approach, exhibiting an oracle complexity of $\tilde{O}(\epsilon^{-2/7})$ for convex optimization. Despite its theoretical optimality, its inexact version and efficient implementation have received insufficient attention. We introduce the inexact A-NPE method (IA-NPE), which is shown to maintain the near-optimal oracle complexity. In particular, we design a dynamic approach to balance the computational cost of constructing the Hessian matrix against the progress of convergence. Moreover, we show that the line-search procedure, a subroutine of IA-NPE, is robust to inexactness in the Hessian. These properties make it possible to use effective machine learning techniques such as sub-sampling, along with various heuristics, within the method. Extensive numerical results illustrate that IA-NPE compares favorably with state-of-the-art second-order methods, including Newton's method with cubic regularization and trust-region methods.

Paper Structure

This paper contains 19 sections, 21 theorems, 116 equations, 3 figures, 1 table, 2 algorithms.

Key Result

Proposition 3.1

Let $(\lambda,x) \in \mathbb{R}_{++} \times \mathbb{R}^d$ and a $(\hat{\sigma},\delta)$-approximate Newton solution $(y,u,\epsilon)$ at $(\lambda,x)$ be given, and define $v:=\mathcal{G}(y)+u-\mathcal{G}_{x,\delta}(y)$. Then, and

Figures (3)

  • Figure 1: Log-scale norm of gradient vs. time.
  • Figure 2: Log-scale norm of gradient vs. number of epochs.
  • Figure 3: Log-scale norm of gradient vs. number of epochs.

Theorems & Definitions (31)

  • Definition 2.1
  • Definition 2.2
  • Definition 2.3
  • Definition 3.1
  • Definition 3.2
  • Remark 3.1
  • Definition 3.3
  • Definition 3.4
  • Definition 3.5
  • Proposition 3.1
  • ...and 21 more