A structured L-BFGS method and its application to inverse problems

Florian Mannel, Hari Om Aggrawal, Jan Modersitzki

TL;DR

It is shown that the new method outperforms other structured L-BFGS methods and classical L-BFGS on non-convex real-life problems from medical image registration and compares favorably with classical L-BFGS on ill-conditioned quadratic model problems.

Abstract

Many inverse problems are phrased as optimization problems in which the objective function is the sum of a data-fidelity term and a regularization. Often, the Hessian of the fidelity term is computationally unavailable while the Hessian of the regularizer allows for cheap matrix-vector products. In this paper, we study an LBFGS method that takes advantage of this structure. We show that the method converges globally without convexity assumptions and that the convergence is linear under a Kurdyka--Łojasiewicz-type inequality. In addition, we prove linear convergence to cluster points near which the objective function is strongly convex. To the best of our knowledge, this is the first time that linear convergence of an LBFGS method is established in a non-convex setting. The convergence analysis is carried out in infinite dimensional Hilbert space, which is appropriate for inverse problems but has not been done before. Numerical results show that the new method outperforms other structured LBFGS methods and classical LBFGS on non-convex real-life problems from medical image registration. It also compares favorably with classical LBFGS on ill-conditioned quadratic model problems. An implementation of the method is freely available.
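The structure described in the abstract (an unavailable fidelity Hessian, a cheap regularizer Hessian) can be illustrated with the standard L-BFGS two-loop recursion in which the seed matrix is $\tau I + S$ instead of a multiple of the identity. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function name, the dense matrix `S` standing in for the regularizer Hessian, the fixed scalar `tau`, and the direct solve (in place of an iterative solver) are all hypothetical choices made for illustration.

```python
import numpy as np

def structured_lbfgs_direction(grad, pairs, tau, S):
    """Return d = -H @ grad, where H is the L-BFGS inverse Hessian
    approximation built on the seed matrix B0 = tau*I + S.

    grad  : gradient at the current iterate (1-D float array)
    pairs : list of (s, y) curvature pairs, oldest first, with y @ s > 0
    tau   : positive scalar (assumed given here)
    S     : symmetric positive semi-definite matrix, e.g. the Hessian
            of the regularizer (dense here purely for illustration)
    """
    q = np.array(grad, dtype=float)
    history = []
    # First loop: newest pair to oldest pair.
    for s, y in reversed(pairs):
        rho = 1.0 / (y @ s)
        alpha = rho * (s @ q)
        q = q - alpha * y
        history.append((alpha, rho))
    # Seed step: apply the inverse of B0 = tau*I + S.
    # A direct solve is used for brevity; large problems would
    # call for an iterative solver instead.
    r = np.linalg.solve(tau * np.eye(q.size) + S, q)
    # Second loop: oldest pair to newest pair.
    for (s, y), (alpha, rho) in zip(pairs, reversed(history)):
        beta = rho * (y @ r)
        r = r + (alpha - beta) * s
    return -r
```

With an empty `pairs` list the function reduces to the seed step $d = -(\tau I + S)^{-1}\nabla{\cal J}$, and with at least one pair the returned operator satisfies the secant equation for the most recent pair, which is a convenient sanity check.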

Paper Structure (41 sections, 10 theorems, 60 equations, 7 figures, 4 tables)

This paper contains 41 sections, 10 theorems, 60 equations, 7 figures, 4 tables.

Key Result

Lemma 2.2

Let ${\cal J}:{\cal X}\rightarrow\mathbb{R}$ be twice continuously differentiable. Let $x_k,x_{k+1}\in{\cal X}$ be such that $y_k:=\nabla{\cal J}(x_{k+1})-\nabla{\cal J}(x_k)$ and $s_k:=x_{k+1}-x_k$ satisfy $y_k^T s_k>0$. Let $\overline{\nabla^2{\cal J}_k}$ be positive semi-definite. Then for $\hat{\tau}_{k+1}^y$ and $\hat{\tau}_{k+1}^s$ from Definition 2.1 we have [...], where the upper bound requires that $\overline{\nabla^2{\cal J}_k}$ ...
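For orientation, the two classical Barzilai-Borwein (BB) step sizes can be computed from a single pair $(s_k, y_k)$. The sketch below is illustrative only: which of the two quantities the paper labels $\hat{\tau}^y$ and which $\hat{\tau}^s$ is not recoverable from the text above, so the neutral names `bb_short` and `bb_long` are used, and the function name is hypothetical. For a quadratic with symmetric positive definite Hessian $A$ (so that $y = As$), both values lie between $1/\lambda_{\max}(A)$ and $1/\lambda_{\min}(A)$, and the Cauchy-Schwarz inequality gives `bb_short <= bb_long`.

```python
import numpy as np

def bb_step_sizes(s, y):
    """Return the two classical Barzilai-Borwein step sizes.

    bb_short = (s.y) / (y.y)   and   bb_long = (s.s) / (s.y).
    Requires positive curvature s.y > 0.
    """
    sy = float(s @ y)
    if sy <= 0.0:
        raise ValueError("curvature condition y^T s > 0 is violated")
    bb_short = sy / float(y @ y)
    bb_long = float(s @ s) / sy
    return bb_short, bb_long

# Example on a diagonal quadratic with eigenvalues between 1 and 10.
A = np.diag([1.0, 2.0, 5.0, 10.0])
s = np.array([1.0, -2.0, 0.5, 3.0])
bb_short, bb_long = bb_step_sizes(s, A @ s)
print(bb_short, bb_long)
```

Either value can serve as the scalar in a seed of the form $\hat{\tau} I$; the example only shows how the two candidates relate on a quadratic model problem.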

Figures (7)

  • Figure 1: Approximation strategies. Left: BB step sizes for $H_k=\hat{\tau}_k I$; cf. \ref{lem:tau}. Right: Scaling factors from \ref{def:tau:B} for $B_k=\tau_k I+S_k$; cf. \ref{lem_relationbetweendifferenttaus}.
  • Figure 2: Performance profiles comparing Wolfe and Armijo line-search algorithms on IR problems; Armijo yields a significantly lower run-time at the expense of a somewhat lower accuracy.
  • Figure 3: Performance profiles for \ref{alg_hybrid} with linear solver PMINRES. PMINRES terminates when the iteration count reaches maxiter or the relative residual falls below tol.
  • Figure 4: Performance profiles for different choices of the seed matrix. The first row compares ${\mathrm{Bs}}$, ${\mathrm{Bz}}$, ${\mathrm{Bu}}$, ${\mathrm{Bg}}$ and ${\mathrm{Adap}}$. The second row shows that ${\mathrm{Adap}}$ outperforms the two unstructured state-of-the-art methods ${\mathrm{Hs}}$ and ${\mathrm{Hy}}$.
  • Figure 5: Performance profiles of three structured L-BFGS methods, based on the image registration problems with non-quadratic regularizers. The approach proposed in this paper produces the lowest run-time while yielding the highest accuracy.
  • ...and 2 more figures

Theorems & Definitions (27)

  • Definition 2.1: BB scaling
  • Lemma 2.2
  • Remark 2.3
  • Definition 3.1
  • Lemma 3.2
  • proof
  • Lemma 4.1
  • proof
  • Lemma 4.3
  • proof
  • ...and 17 more