Incremental Quasi-Newton Methods with Faster Superlinear Convergence Rates

Zhuanghua Liu; Luo Luo; Bryan Kian Hsiang Low

TL;DR

This work tackles large-scale finite-sum strongly convex optimization by introducing the Lazy Incremental Symmetric Rank-1 method (LISR-1) and its block extension LISR-$k$, which apply symmetric rank-1 (SR1) updates in an incremental setting to estimate curvature. The algorithms keep the per-iteration cost low through lazy updates and a cyclic update scheme, and they achieve condition-number-free local superlinear convergence rates: $\mathcal{O}((1-d^{-1})^{\lceil t/n\rceil^2})$ for LISR-1 and $\mathcal{O}((1-k/d)^{\lceil t/n\rceil^2})$ for LISR-$k$, with $\mathcal{O}(1)$ gradient/Hessian-vector oracle calls and $\mathcal{O}(d^2)$ flops per iteration. The theoretical analysis uses a Hessian-approximation metric $\nu(\cdot,\cdot)$ under Broyden-family and greedy SR1 updates to bound the convergence and approximation errors. Empirical results on quadratic programs and regularized logistic regression show that the proposed methods outperform baseline IQN variants, supporting both the theoretical rates and the practical efficiency, with the block variant offering further acceleration at modest additional cost.
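To make the greedy SR1 ingredient concrete, here is a minimal Python sketch on a single fixed quadratic. It is not the LISR-1 algorithm itself: there is no finite sum, no lazy or cyclic scheduling, and the function names (`sr1_update`, `greedy_direction`), the initialization `G = 1.1 * L * I`, and the test matrix are illustrative assumptions. It uses the textbook SR1 formula $G_+ = G - \frac{(G-A)uu^\top(G-A)}{u^\top(G-A)u}$ and picks $u$ as the coordinate vector with the largest diagonal entry of $G-A$, which is one common form of the greedy rule.

```python
import numpy as np

def sr1_update(G, A, u, tol=1e-12):
    """Textbook SR1 update of the approximation G toward A along direction u."""
    r = (G - A) @ u          # in practice only the product A @ u is needed
    denom = u @ r
    if abs(denom) < tol:     # no curvature gap along u: leave G unchanged
        return G
    return G - np.outer(r, r) / denom

def greedy_direction(G, A):
    """Coordinate vector e_i maximizing e_i^T (G - A) e_i (assumed greedy rule)."""
    i = int(np.argmax(np.diag(G - A)))
    e = np.zeros(G.shape[0])
    e[i] = 1.0
    return e

# Hypothetical test problem: a random symmetric positive definite "Hessian" A.
rng = np.random.default_rng(0)
d = 6
M = rng.standard_normal((d, d))
A = M @ M.T + np.eye(d)

# Start from an upper bound G >= A (assumption: 1.1 * largest eigenvalue).
L = np.linalg.eigvalsh(A).max()
G = 1.1 * L * np.eye(d)

errors = []
for _ in range(d):
    G = sr1_update(G, A, greedy_direction(G, A))
    errors.append(np.linalg.norm(G - A))
```

Each update removes one rank from the residual $G - A$, so in this toy setting the approximation matches $A$ (up to rounding) after $d$ steps, and `errors` records the Frobenius-norm gap along the way.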

Abstract

We consider the finite-sum optimization problem in which each component function is strongly convex and has Lipschitz continuous gradient and Hessian. The recently proposed incremental quasi-Newton method is based on the BFGS update and achieves a local superlinear convergence rate that depends on the condition number of the problem. This paper proposes a more efficient quasi-Newton method by incorporating the symmetric rank-1 update into the incremental framework, which yields a condition-number-free local superlinear convergence rate. Furthermore, we can boost our method by applying a block update to the Hessian approximation, which leads to an even faster local convergence rate. The numerical experiments show that the proposed methods significantly outperform the baseline methods.
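The block update mentioned in the abstract can be illustrated with a rank-$k$ generalization of SR1. The sketch below is an assumption-laden toy, not the paper's LISR-$k$: it uses the generic block formula $G_+ = G - (G-A)U\,(U^\top(G-A)U)^{-1}U^\top(G-A)$, chooses $U$ as the $k$ coordinate vectors with the largest diagonal entries of $G-A$, works on one fixed quadratic, and takes $d$ divisible by $k$ so that the small $k \times k$ system stays nonsingular. The names `block_sr1_update` and `greedy_block` are hypothetical.

```python
import numpy as np

def block_sr1_update(G, A, U):
    """Rank-k block SR1-style update of G toward A using the columns of U."""
    RU = (G - A) @ U                 # d x k residual applied to the block
    S = U.T @ RU                     # k x k matrix U^T (G - A) U
    return G - RU @ np.linalg.solve(S, RU.T)

def greedy_block(G, A, k):
    """Columns e_i for the k largest diagonal entries of G - A (assumed rule)."""
    idx = np.argsort(np.diag(G - A))[-k:]
    return np.eye(G.shape[0])[:, idx]

# Hypothetical test problem; d is chosen divisible by k on purpose.
rng = np.random.default_rng(1)
d, k = 8, 2
M = rng.standard_normal((d, d))
A = M @ M.T + np.eye(d)

L = np.linalg.eigvalsh(A).max()
G = 1.1 * L * np.eye(d)              # initial approximation with G >= A

block_errors = []
for _ in range(d // k):
    G = block_sr1_update(G, A, greedy_block(G, A, k))
    block_errors.append(np.linalg.norm(G - A))
```

Here each step removes $k$ ranks from the residual, so $d/k$ block steps suffice in this toy problem, compared with $d$ rank-1 steps. This matches the intuition behind the $(1-k/d)$ versus $(1-d^{-1})$ factors quoted above, though the sketch does not reproduce the paper's analysis.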

Paper Structure

This paper contains 35 sections, 16 theorems, 103 equations, 5 figures, 2 tables, and 4 algorithms.

Introduction
Paper Organization
Related Work
Classical Quasi-Newton Methods
Block Quasi-Newton Methods
Stochastic/Incremental Quasi-Newton Methods
Preliminaries
Notations
Assumptions
Broyden Family Update
Methodology
The Algorithm
Convergence Analysis
Discussion
Extension to Block Quasi-Newton Methods
...and 20 more sections

Key Result

Lemma 4.1

The iteration formula (cur_iter_update) satisfies for all $t \geq 1$, where $\Gamma^t \coloneqq \|(\sum_{i=1}^n B_i^t)^{-1}\|$.

Figures (5)

Figure 1: Normalized error vs. the number of effective passes for the quadratic programming problem.
Figure 2: Normalized error vs. the number of effective passes for the regularized logistic regression problem on several real-world datasets.
Figure 3: Comparison of the proposed methods with baselines for the quadratic function minimization problem.
Figure 4: Comparison of the LISR-$k$ method with different choices of $k$ for the general function minimization.
Figure 5: Comparison of the LISR-$k$ method with different choices of $k$ for the general function minimization.

Theorems & Definitions (29)

Definition 3.3
Definition 3.4
Remark 3.5
Lemma 4.1
Remark 4.2
Lemma 4.3
Remark 4.4
Lemma 4.5
Theorem 4.6
Theorem 5.1
...and 19 more
