Soft quasi-Newton: Guaranteed positive definiteness by relaxing the secant constraint

Erik Berglund; Jiaojiao Zhang; Mikael Johansson


TL;DR

The paper proposes soft QN, a quasi-Newton algorithm that, for strongly convex objectives, converges linearly to a neighborhood of the optimal solution even when gradient and function evaluations are subject to bounded perturbations. In the reported numerical experiments it outperforms state-of-the-art methods across a range of scenarios.

Abstract

We propose a novel algorithm, termed soft quasi-Newton (soft QN), for optimization in the presence of bounded noise. Traditional quasi-Newton algorithms are vulnerable to such perturbations. To develop a more robust quasi-Newton method, we replace the secant condition in the matrix optimization problem for the Hessian update with a penalty term in its objective and derive a closed-form update formula. A key feature of our approach is its ability to maintain positive definiteness of the Hessian inverse approximation. Furthermore, we establish the following properties of soft QN: it recovers the BFGS method under specific limits, it treats positive and negative curvature equally, and it is scale invariant. Collectively, these features enhance the efficacy of soft QN in noisy environments. For strongly convex objective functions and Hessian approximations obtained using soft QN, we develop an algorithm that exhibits linear convergence toward a neighborhood of the optimal solution, even if gradient and function evaluations are subject to bounded perturbations. Through numerical experiments, we demonstrate superior performance of soft QN compared to state-of-the-art methods in various scenarios.
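
To illustrate why traditional quasi-Newton updates are vulnerable to perturbations, the sketch below applies the textbook BFGS update of the inverse Hessian approximation to a clean and a perturbed gradient difference. It is not the soft QN update, whose closed form is not reproduced on this page. The helper names `bfgs_inverse_update` and `is_positive_definite` and the numerical values are assumptions chosen purely for illustration. When noise makes the curvature term $s^\top y$ negative, the updated matrix loses positive definiteness, so the resulting search direction is no longer guaranteed to be a descent direction.

```python
import numpy as np

def bfgs_inverse_update(H, s, y):
    """Textbook BFGS update of the inverse Hessian approximation H.

    s is the step (x_new - x_old), y is the gradient difference.
    This is standard BFGS, not the soft QN update from the paper.
    """
    rho = 1.0 / (y @ s)
    V = np.eye(len(s)) - rho * np.outer(s, y)
    return V @ H @ V.T + rho * np.outer(s, s)

def is_positive_definite(M):
    """Check positive definiteness via the eigenvalues of the symmetric part."""
    return bool(np.all(np.linalg.eigvalsh((M + M.T) / 2) > 0))

H = np.eye(2)                      # current inverse Hessian approximation
s = np.array([1.0, 0.0])           # step taken
y_clean = np.array([0.5, 0.0])     # exact gradient difference, s.y > 0
noise = np.array([-0.6, 0.0])      # illustrative bounded perturbation (assumed)
y_noisy = y_clean + noise          # perturbed difference, s.y < 0

H_clean = bfgs_inverse_update(H, s, y_clean)
H_noisy = bfgs_inverse_update(H, s, y_noisy)

print("clean update positive definite:", is_positive_definite(H_clean))
print("noisy update positive definite:", is_positive_definite(H_noisy))
```

Both updates satisfy the secant condition exactly, which is precisely what forces the second one to encode negative curvature. Replacing that hard constraint with a penalty, as the abstract describes, removes this forcing.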


Paper Structure (18 sections, 6 theorems, 40 equations, 4 figures, 3 tables, 2 algorithms)


Introduction
Preliminaries: quasi-Newton methods
The soft QN method
A relaxed matrix optimization problem and its solution
Properties of the soft QN update
Soft QN always generates descent directions
Soft QN recovers BFGS in the limit
Soft QN treats positive and negative curvature equally
Soft QN is scale invariant
Comparison with SP-BFGS
Convergence
Bounding the Hessian inverse approximation
Convergence guarantees
Numerical results
Logistic regression
...and 3 more sections

Key Result

Theorem 3.1

For every $\alpha_k>0$ and every $H_k \succ 0$, there exists a unique positive definite solution $B^{\star}$ to eqn:soft_QN_problem with the function $\upsilon$ defined in eqn:penalty_term. Letting $H_{k+1}=(B^{\star})^{-1}$ leads to the recursive update where

Figures (4)

Figure 1: The soft QN method detects a diagonal direction of low curvature and takes a large step along it. The saddle-free Newton method is unable to perceive the low curvature in that direction and moves further toward the saddle point before changing its course.
Figure 2: Logistic regression experiments. The 10-logarithm of the gradient norm is plotted against iteration number. Solid lines indicate the mean over all instances in the Monte Carlo simulation, while shaded areas represent three standard deviation confidence intervals.
Figure 3: Experiments on quadratic problems. The 10-logarithm of the normalized suboptimality is plotted against iteration number. Solid lines show the mean over all instances in the Monte Carlo simulation, while shaded areas represent three standard deviation confidence intervals.
Figure 4: Comparison of soft QN and SP-BFGS on the CUTEst problem DIXMAANA. Interval of suboptimality for different test runs, plotted on a logarithmic scale against number of function evaluations.

Theorems & Definitions (7)

Theorem 3.1
Theorem 3.2
Proposition 3.3
Lemma 4.1
Theorem 4.2
Theorem 4.3
Remark 1
