Efficient nonlocal linear image denoising: Bilevel optimization with Nonequispaced Fast Fourier Transform and matrix-free preconditioning

Andrés Miniguano-Trujillo; John W. Pearson; Benjamin D. Goddard

TL;DR

The paper develops a fast, storage-efficient framework for nonlocal image denoising via bilevel optimization using an unnormalized extended Gaussian ANOVA kernel. It integrates Nonequispaced Fast Fourier Transform (NFFT) for fast kernel summations, and employs matrix-free Krylov solvers with a novel change-of-basis (deflation) technique to isolate the smallest eigenvalue, complemented by diagonal and dense preconditioners. The authors provide theoretical spectral results for graph-Laplacian-like operators, plus extensive numerical experiments on large-scale images and parameter-learning tasks, demonstrating near-constant iteration counts and substantial speedups. The work enables solving large, dense, and ill-conditioned nonlocal systems with low memory footprints, and includes open-source code for reproducibility and broader application to similar inverse problems.
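The deflation idea can be illustrated with a short worked derivation. This is only a sketch under stated assumptions, not the authors' exact construction: it assumes $A$ is symmetric with $A\mathbf{1} = \lambda \mathbf{1}$ (consistent with the figure captions, where $\lambda$ is the smallest eigenvalue), and the block names $U_2$, $x_1$, $\mathbf{x}_2$ are introduced here for illustration. Take any orthogonal $U = [\, n^{-1/2}\mathbf{1} \;\; U_2 \,]$ and substitute $\mathbf{u} = U\mathbf{x}$ into $A\mathbf{u} = \lambda \mathbf{f}$:

$$
U^\top A U = \begin{pmatrix} \lambda & \mathbf{0}^\top \\ \mathbf{0} & U_2^\top A U_2 \end{pmatrix},
\qquad
\lambda x_1 = \lambda\, n^{-1/2}\, \mathbf{1}^\top \mathbf{f},
\qquad
\left( U_2^\top A U_2 \right) \mathbf{x}_2 = \lambda\, U_2^\top \mathbf{f}.
$$

The first component is therefore known in closed form, $x_1 = n^{-1/2}\,\mathbf{1}^\top \mathbf{f}$, so the mean of $\mathbf{u}$ equals the mean of $\mathbf{f}$. Only the reduced $(n-1)$-dimensional system needs an iterative solve, and its smallest eigenvalue is the second-smallest eigenvalue of $A$ rather than $\lambda$, which is what improves the conditioning when $\lambda$ is tiny.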

Abstract

We present a new approach for nonlocal image denoising, based around the application of an unnormalized extended Gaussian ANOVA kernel within a bilevel optimization algorithm. A critical bottleneck when solving such problems for finely-resolved images is the solution of huge-scale, dense linear systems arising from the minimization of an energy term. We tackle this using a Krylov subspace approach, with a Nonequispaced Fast Fourier Transform utilized to approximate matrix-vector products in a matrix-free manner. We accelerate the algorithm using a novel change of basis approach to account for the (known) smallest eigenvalue-eigenvector pair of the matrices involved, coupled with a simple but frequently very effective diagonal preconditioning approach. We present a number of theoretical results concerning the eigenvalues and predicted convergence behavior, and a range of numerical experiments which validate our solvers and use them to tackle parameter learning problems. These demonstrate that very large problems may be effectively and rapidly denoised with very low storage requirements on a computer.
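As a rough illustration of the solver described above, the following sketch runs a diagonally preconditioned conjugate gradient method on a system accessed only through matrix-vector products. Several pieces are assumptions rather than the paper's method: the system is taken to have the form $A = \lambda I + \mu (D - W)$ with a Gaussian weight matrix $W$ (inferred from the figure captions), the dense product `W @ v` is a small-scale stand-in for the NFFT-based fast summation, and the names `make_operator` and `pcg` as well as all parameter values are hypothetical.

```python
import numpy as np


def make_operator(points, sigma, lam, mu):
    """Build v -> A v for A = lam*I + mu*(D - W), plus diag(A).

    W_ij = exp(-|x_i - x_j|^2 / sigma^2) with zero diagonal (assumed form).
    The dense W is only a stand-in for an NFFT-based fast summation.
    """
    diff = points[:, None, :] - points[None, :, :]
    W = np.exp(-np.sum(diff**2, axis=-1) / sigma**2)
    np.fill_diagonal(W, 0.0)
    degree = W @ np.ones(len(points))  # row sums of W, i.e. diagonal of D

    def apply_A(v):
        return lam * v + mu * (degree * v - W @ v)

    diag_A = lam + mu * degree  # exact diagonal because W_ii = 0
    return apply_A, diag_A


def pcg(apply_A, b, diag_M, tol=1e-8, maxiter=500):
    """Conjugate gradients with a diagonal (Jacobi) preconditioner."""
    x = np.zeros_like(b)
    r = b.copy()          # residual for the zero initial guess
    z = r / diag_M        # preconditioned residual
    p = z.copy()
    rz = r @ z
    b_norm = np.linalg.norm(b)
    for k in range(1, maxiter + 1):
        Ap = apply_A(p)
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= tol * b_norm:
            return x, k
        z = r / diag_M
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, maxiter


# Toy data: 200 scattered feature points and a noisy signal f (illustrative only).
rng = np.random.default_rng(0)
n = 200
points = rng.random((n, 2))
f = np.sin(2 * np.pi * points[:, 0]) + 0.1 * rng.standard_normal(n)

lam, mu, sigma = 1e-2, 1.0 / n, 0.5
apply_A, diag_A = make_operator(points, sigma, lam, mu)
b = lam * f  # right-hand side of A u = lam * f
u, iterations = pcg(apply_A, b, diag_A)
```

Because the solver touches $A$ only through `apply_A`, swapping the dense product for a fast summation routine would leave `pcg` unchanged; that separation is the point of the matrix-free design.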

Paper Structure (26 sections, 12 theorems, 97 equations, 8 figures, 3 tables)

This paper contains 26 sections, 12 theorems, 97 equations, 8 figures, 3 tables.

Introduction
Notation and concepts
Preliminaries on variational denoising and control
Variational denoising
Preliminaries on nonlocal calculus
Nonlocal denoising
Parameter learning via bilevel optimization
The choice of weights
Discretization
NFFT--based Fast Gauss Transform
Preconditioning under a new basis
Change of basis
An explicit transformation
Unitary representation
Application to graph Laplacians
...and 11 more sections

Key Result

Proposition 2.1

Let $\gamma \in L^\infty (\Omega\times\Omega)$. Then the functional $R:L^2(\Omega) \to \mathbb{R}$ is differentiable and convex in $L^2(\Omega)$, and its derivative is continuous and self--adjoint.

Figures (8)

Figure 1: Representation of a neighborhood of features for a point $\mathbf{x} \in \Omega$.
Figure 1: Evolution of eigenvalues and condition number as $\sigma$ varies in the interval $[10,1.4 \times 10^3]$ and with $\lambda = 10^{-9}$. All the horizontal axes are displayed in logarithmic scale. Each ordered quantity is connected by a dotted line as $\sigma$ increases, suggesting a continuous effect of $\sigma$. Panel a displays each point of $\Sigma(A)$ corresponding to a fixed value of $\sigma$. A plum horizontal solid line at $\lambda$ highlights the smallest eigenvalue. The dashed red line at the top represents the upper bound $\lambda + \mu n$. Panel b represents $\Sigma( \mathsf{P}_{\text{a}}^{-1} A)$ as $\sigma$ evolves. The vertical axis features a scaling around 1 for improved visualization. Panel c showcases the evolution of the condition number of $A$ and its preconditioned variants. The upper and lower bounds in dashed purple lines represent $1$ and $1 + \lambda^{-1} \mu n$. Panel d represents $\Sigma( \pi_2 ( \mathsf{D}^{-1}_{a,X} \mathsf{A}_{X} ) )$ as $\sigma$ evolves. The vertical axis features a scaling around 1.
Figure 1: Images used for numerical tests.
Figure 2: Comparative display of the eigenvalues of $A$, $\mathsf{P}_{\text{a}}^{-1} A$, and $\mathsf{P}_{\text{b}}^{-1} A$ with $\lambda = 10^{-9}$. The horizontal axis quantifies each eigenvalue, while the vertical alignment is non--quantitative and solely for visual separation. Top row (black $\ocircle$ markers): $\Sigma(A)$ is represented. The grey dashed lines on the right are the bounds for $\rho(A)$ as derived in \ref{r:general_spectral_bounds}. The bounds for $a(A)$, as per \ref{lemma:UpperConnectivity} and \ref{lem:lower_bound_algebraic_connectivity}, are depicted with teal dashed lines. A plum vertical solid line at $\lambda$ highlights the smallest eigenvalue. Middle row (orange $\vartriangleright$ markers): $\Sigma( \mathsf{P}_{\text{a}}^{-1} A)$ is represented. The upper and lower bounds on this set based on \ref{lem:Rayleigh_Diag_Jac} are included in dotted red lines. Bottom row (yellow $\vartriangleright$ markers): $\Sigma( \mathsf{P}_{\text{b}}^{-1} A)$ is represented. The bounds from \ref{lem:Prec_Norm_2_Diag} are omitted for visual clarity.
Figure 2: Comparative display of the number of CG iterations for different regularization values $\lambda \in \Lambda$ and each choice of preconditioner. Here $\mathsf{M} \in \{ \mathsf{I}_{n}, \mathsf{P}_{\text{a}}, \mathsf{P}_{\text{b}}\}$ was used for the system $A\mathbf{u} = \lambda \mathbf{f}$, while $\mathsf{M} \in \{\mathsf{I}_U, \mathcal{P}_{U,\mathsf a}, \mathcal{P}_{U,\mathsf b} \}$ was used for the decoupled version of the system $\mathsf{A}_{U} \mathbf{x} = \lambda U^\top \mathbf{f}$. The strong black and red dashed lines correspond to the unpreconditioned cases for both bases, while the other four dashed lines correspond to the preconditioned cases. If a test failed, its line is extrapolated beyond the displayed vertical range to indicate that the method requires more iterations than the maximum allowed. The teal dashed vertical line represents the approximation of the algebraic connectivity $a(B) \approx \frac{n}{n-1} \mu \min \pmb{\eta}$.
...and 3 more figures

Theorems & Definitions (24)

Proposition 2.1
Theorem 2.2
Theorem 2.3
Remark 2.4
Lemma 4.1
Proof 1
Theorem 4.2
Lemma 4.3
Proof 2
Lemma 4.4
...and 14 more
