Novel Limited Memory Quasi-Newton Methods Based On Optimal Matrix Approximation

Erik Berglund, Mikael Johansson

TL;DR

This article proposes a trust region method in which the Hessian approximation, after having been updated by a Broyden class formula and used to solve a trust-region problem, is replaced by one of its closest limited memory approximations.

Abstract

Update formulas for the Hessian approximations in quasi-Newton methods such as BFGS can be derived as analytical solutions to certain nearest-matrix problems. In this article, we propose a similar idea for deriving new limited memory versions of quasi-Newton methods. Most limited memory quasi-Newton methods make use of Hessian approximations that can be written as a scaled identity matrix plus a symmetric matrix with limited rank. We derive a way of finding the nearest matrix of this type to an arbitrary symmetric matrix, in either the Frobenius norm, the induced $l^2$ norm, or a dissimilarity measure for positive definite matrices in terms of trace and determinant. In doing so, we lay down a framework for more general matrix optimization problems with unitarily invariant matrix norms and arbitrary constraints on the set of eigenvalues. We then propose a trust region method in which the Hessian approximation, after having been updated by a Broyden class formula and used to solve a trust-region problem, is replaced by one of its closest limited memory approximations. We propose to store the Hessian approximation in terms of its eigenvectors and eigenvalues in a way that completely defines its eigenvalue decomposition, as this simplifies both the solution of the trust region subproblem and the nearest limited memory matrix problem. Our method is compared to a reference trust region method with the usual limited memory BFGS updates, and is shown to require fewer iterations and the storage of fewer vectors for a variety of test problems.
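The Frobenius-norm case of the nearest-matrix problem described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' algorithm: the function name `nearest_limited_memory_frobenius` and the contiguous-window search are assumptions made here. It relies on the eigenvector-sharing result stated in the abstract, so only the eigenvalues need to change. With $n - m$ eigenvalues collapsed to a common value $\sigma$ and the other $m$ kept exact, the squared Frobenius error is the sum of squared deviations of the collapsed eigenvalues from their mean, which is smallest for some contiguous window of the sorted spectrum.

```python
import numpy as np

def nearest_limited_memory_frobenius(A, m):
    """Sketch: nearest matrix sigma*I + L (L symmetric, rank <= m) to a
    symmetric A in the Frobenius norm. Hypothetical helper, not from the paper."""
    n = A.shape[0]
    if not 0 <= m < n:
        raise ValueError("m must satisfy 0 <= m < n")
    lam, Q = np.linalg.eigh(A)      # eigenvalues in ascending order
    k = n - m                       # number of eigenvalues collapsed to sigma
    best_start, best_cost = 0, np.inf
    # Try every contiguous window of k sorted eigenvalues and keep the one
    # with the smallest sum of squared deviations from its own mean.
    for start in range(m + 1):
        window = lam[start:start + k]
        cost = np.sum((window - window.mean()) ** 2)
        if cost < best_cost:
            best_start, best_cost = start, cost
    sigma = lam[best_start:best_start + k].mean()
    x = lam.copy()
    x[best_start:best_start + k] = sigma   # collapse the chosen window
    X = (Q * x) @ Q.T                      # same eigenvectors as A
    return X, sigma

# Usage: three eigenvalues near 1 are merged, the outliers 5 and -3 are kept.
A = np.diag([1.0, 1.1, 0.9, 5.0, -3.0])
X, sigma = nearest_limited_memory_frobenius(A, m=2)
```

Storing only `sigma`, the $m$ retained eigenvalues, and their eigenvectors is then enough to represent `X`, which is the kind of limited memory representation the abstract refers to.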

Paper Structure

This paper contains 22 sections, 20 theorems, 57 equations, 5 figures, 2 tables, 2 algorithms.

Key Result

Theorem 1

Consider the matrix optimization problem $\min_X \| X - A \|$ subject to $Eig(X) \in S$, where $\| \cdot \|$ is any unitarily invariant norm, $A$ is a real symmetric matrix and $Eig(X) \in S$ is an arbitrary constraint on the multiset of eigenvalues of $X$. If this problem has at least one optimal solution, then it has an optimal solution $\widehat{X}$ with the same eigenvectors as $A$.
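As a worked illustration of why this result is useful, consider the Frobenius norm with a nearest-matrix objective $\| X - A \|_F$. The notation below is introduced here as an assumption and is not taken from the paper: write $A = Q \Lambda Q^T$ with $\Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$ and $Q$ orthogonal, and restrict the search to $X = Q \, \mathrm{diag}(x_1, \dots, x_n) \, Q^T$, which the theorem permits. Unitary invariance then gives

$$
\| X - A \|_F^2 = \| Q \, (\mathrm{diag}(x) - \Lambda) \, Q^T \|_F^2 = \| \mathrm{diag}(x) - \Lambda \|_F^2 = \sum_{i=1}^{n} (x_i - \lambda_i)^2 ,
$$

so the matrix problem reduces to choosing a vector of eigenvalues $x$, with multiset $\{x_1, \dots, x_n\} \in S$, that is closest to $(\lambda_1, \dots, \lambda_n)$.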

Figures (5)

  • Figure 1: Curvature aggregation test with the $l^2$ norm.
  • Figure 2: Curvature aggregation test with the Frobenius norm.
  • Figure 3: Results from the test with logistic regression, when using the best values of $m$ for each algorithm. The stars on the graphs mark the point at which each of the algorithms fulfills the convergence condition $\|\nabla f(x_k)\|_2 \leq 10^{-6}$.
  • Figure 4: Results for the randomly generated QPs. The average base-10 logarithm of the normalized Euclidean distance to the optimum as a function of iteration number $k$, for each of the three algorithms, with $\pm 3 \sigma$ confidence intervals.
  • Figure 5: Performance profile for the L2-BFGS, LF-BFGS, MSS and L-BFGS methods.

Theorems & Definitions (34)

  • Theorem 1
  • proof
  • Theorem 2
  • proof
  • Proposition 2.1
  • proof
  • Theorem 3
  • proof
  • Lemma 1
  • proof
  • ...and 24 more