A structured L-BFGS method and its application to inverse problems

Florian Mannel; Hari Om Aggrawal; Jan Modersitzki

TL;DR

The proposed structured L-BFGS method outperforms other structured L-BFGS methods and classical L-BFGS on non-convex real-life problems from medical image registration, and it compares favorably with classical L-BFGS on ill-conditioned quadratic model problems.

Abstract

Many inverse problems are phrased as optimization problems in which the objective function is the sum of a data-fidelity term and a regularizer. Often, the Hessian of the fidelity term is computationally unavailable while the Hessian of the regularizer allows for cheap matrix-vector products. In this paper, we study an L-BFGS method that takes advantage of this structure. We show that the method converges globally without convexity assumptions and that the convergence is linear under a Kurdyka--Łojasiewicz-type inequality. In addition, we prove linear convergence to cluster points near which the objective function is strongly convex. To the best of our knowledge, this is the first time that linear convergence of an L-BFGS method is established in a non-convex setting. The convergence analysis is carried out in infinite-dimensional Hilbert space, which is appropriate for inverse problems but has not been done before. Numerical results show that the new method outperforms other structured L-BFGS methods and classical L-BFGS on non-convex real-life problems from medical image registration. It also compares favorably with classical L-BFGS on ill-conditioned quadratic model problems. An implementation of the method is freely available.
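To make the structural idea concrete, the following is a minimal sketch of a two-loop recursion whose seed matrix has the form $B_0=\tau I+S$, where $S$ stands in for the regularizer Hessian. It is an illustration under stated assumptions, not the authors' implementation: the function name `structured_two_loop`, the dense matrix `S`, the fixed value of `tau`, and the direct solve via `numpy.linalg.solve` are all hypothetical choices made for brevity. In practice the seed system would be solved iteratively using only matrix-vector products with $S$.

```python
import numpy as np

def structured_two_loop(grad, s_list, y_list, tau, S):
    """Return d = -H @ grad, where H is the L-BFGS inverse Hessian
    approximation built from the seed matrix B0 = tau * I + S.
    Hypothetical helper for illustration, not the paper's code."""
    q = np.array(grad, dtype=float)
    rhos = [1.0 / np.dot(y, s) for s, y in zip(s_list, y_list)]
    alphas = []
    # First loop: run from the newest stored pair to the oldest.
    for s, y, rho in zip(reversed(s_list), reversed(y_list), reversed(rhos)):
        alpha = rho * np.dot(s, q)
        alphas.append(alpha)
        q -= alpha * y
    # Seed solve with B0 = tau * I + S. A dense direct solve is used here
    # only as a stand-in for an iterative solver.
    r = np.linalg.solve(tau * np.eye(q.size) + S, q)
    # Second loop: run from the oldest stored pair to the newest.
    for s, y, rho, alpha in zip(s_list, y_list, rhos, reversed(alphas)):
        beta = rho * np.dot(y, r)
        r += (alpha - beta) * s
    return -r

# Toy data (assumed): a quadratic whose Hessian is A = S + M^T M.
rng = np.random.default_rng(0)
n = 5
S = np.diag(np.arange(1.0, n + 1))   # stand-in regularizer Hessian
M = rng.standard_normal((n, n))
A = S + M.T @ M                      # symmetric positive definite
s_list = [rng.standard_normal(n) for _ in range(3)]
y_list = [A @ s for s in s_list]     # y = A s, hence y^T s > 0

# The BFGS secant condition H y = s for the newest pair gives d = -s.
d = structured_two_loop(y_list[-1], s_list, y_list, tau=1.0, S=S)

# With empty memory the direction reduces to the pure seed solve.
g = rng.standard_normal(n)
d0 = structured_two_loop(g, [], [], tau=2.0, S=S)
```

The only difference from the classical recursion is the middle step: classical L-BFGS multiplies by a scaled identity there, whereas the structured variant solves a linear system with $\tau I+S$, which is where the extra cost and the extra curvature information both enter.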

Paper Structure (41 sections, 10 theorems, 60 equations, 7 figures, 4 tables)

Introduction
Topic and main contributions of the paper
Structured seed matrices for structured problems
Costs and benefits of a structured seed matrix
The choice of the scaling factor in the structured L-BFGS method
Main convergence results for the structured L-BFGS method
Related work
Code availability
Organization and notation
The classical L-BFGS method
Step size selection
Choice of the scaling factor $\hat{\tau}_k$
The structured L-BFGS method
Choice of the scaling factor $\tau_k$
The two-loop recursion
...and 26 more sections

Key Result

Lemma 2.2

Let ${\cal J}:{\cal X}\rightarrow\mathbb{R}$ be twice continuously differentiable. Let $x_k,x_{k+1}\in{\cal X}$ be such that $y_k:=\nabla{\cal J}(x_{k+1})-\nabla{\cal J}(x_k)$ and $s_k:=x_{k+1}-x_k$ satisfy $y_k^T s_k>0$. Let $\overline{\nabla^2{\cal J}_k}$ be positive semi-definite. Then for $\hat{\tau}_{k+1}^y$ and $\hat{\tau}_{k+1}^s$ from \ref{def:tau:H} we have [displayed bounds missing in the source], where the upper bound requires that $\overline{\nabla^2{\cal J}_k}$ [statement truncated in the source]
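Because the displayed bounds of the lemma are not legible here, the following sketch illustrates only the well-known eigenvalue bounds for the two standard Barzilai-Borwein scalings on a toy quadratic, where the averaged Hessian is simply the constant matrix $A$. The formulas in `bb_scalings` are the textbook ones and are assumed, not confirmed, to coincide with Definition 2.1; which of them the paper denotes by $\hat{\tau}^y$ and which by $\hat{\tau}^s$ cannot be recovered from the text, so the neutral names `tau_short` and `tau_long` are used. The matrix `A` and all data are hypothetical.

```python
import numpy as np

def bb_scalings(s, y):
    """Textbook Barzilai-Borwein scalings (assumed, not taken from the paper).
    Returns (tau_short, tau_long) and requires the curvature condition."""
    ys = np.dot(y, s)
    if ys <= 0:
        raise ValueError("curvature condition y^T s > 0 is violated")
    tau_short = ys / np.dot(y, y)   # y^T s / ||y||^2
    tau_long = np.dot(s, s) / ys    # ||s||^2 / y^T s
    return tau_short, tau_long

# Toy quadratic J(x) = 0.5 * x^T A x with symmetric positive definite A,
# so the gradient difference is y = A s and the averaged Hessian equals A.
rng = np.random.default_rng(1)
M = rng.standard_normal((6, 6))
A = M.T @ M + 0.1 * np.eye(6)
s = rng.standard_normal(6)
y = A @ s

tau_short, tau_long = bb_scalings(s, y)
eigs = np.linalg.eigvalsh(A)        # eigenvalues in ascending order
lower = 1.0 / eigs[-1]              # 1 / lambda_max(A)
upper = 1.0 / eigs[0]               # 1 / lambda_min(A), needs A invertible
```

In this setting both scalings are reciprocals of Rayleigh quotients of $A$, so they lie between `lower` and `upper`, and `tau_short <= tau_long` follows from the Cauchy-Schwarz inequality. The upper bound is finite only when $A$ is invertible, which mirrors the extra requirement mentioned at the end of the lemma.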

Figures (7)

Figure 1: Approximation strategies: Left: BB step sizes for $H_k=\hat{\tau}_k I$; cf. \ref{lem:tau}. Right: Scaling factors from \ref{def:tau:B} for $B_k=\tau_k I+S_k$; cf. \ref{lem_relationbetweendifferenttaus}.
Figure 2: Performance profiles comparing Wolfe and Armijo line-search algorithms on IR problems; Armijo yields a significantly lower run-time at the expense of a somewhat lower accuracy.
Figure 3: Performance profiles for \ref{alg_hybrid} with linear solver PMINRES. PMINRES terminates if the iteration count reaches maxiter or the relative residual is smaller than tol.
Figure 4: Performance profiles for different choices of the seed matrix. The first row compares ${\mathrm{Bs}}$, ${\mathrm{Bz}}$, ${\mathrm{Bu}}$, ${\mathrm{Bg}}$ and ${\mathrm{Adap}}$. The second row shows that ${\mathrm{Adap}}$ outperforms the two unstructured state-of-the-art methods ${\mathrm{Hs}}$ and ${\mathrm{Hy}}$.
Figure 5: Performance profiles of three structured L-BFGS methods, based on the image registration problems with non-quadratic regularizers. The approach proposed in this paper produces the lowest run-time while yielding the highest accuracy.
...and 2 more figures

Theorems & Definitions (27)

Definition 2.1: BB scaling
Lemma 2.2
Remark 2.3
Definition 3.1
Lemma 3.2
proof
Lemma 4.1
proof
Lemma 4.3
proof
...and 17 more
