Solving Inverse Problems with Model Mismatch using Untrained Neural Networks within Model-based Architectures

Peimeng Guan, Naveed Iqbal, Mark A. Davenport, Mudassir Masood

TL;DR

This work introduces an untrained forward model residual block within the model-based architecture to match the data consistency in the measurement domain for each instance. It proposes two variants built on well-known model-based architectures (LU and DEQ), proves their convergence under mild conditions, and offers a unified solution for both linear and nonlinear inverse problems.

Abstract

Model-based deep learning methods such as loop unrolling (LU) and deep equilibrium model (DEQ) extensions offer outstanding performance in solving inverse problems (IP). These methods unroll the optimization iterations into a sequence of neural networks that in effect learn a regularization function from data. While these architectures are currently state-of-the-art in numerous applications, their success heavily relies on the accuracy of the forward model. This assumption can be limiting in many physical applications due to model simplifications or uncertainties in the apparatus. To address forward model mismatch, we introduce an untrained forward model residual block within the model-based architecture to match the data consistency in the measurement domain for each instance. We propose two variants in well-known model-based architectures (LU and DEQ) and prove convergence under mild conditions. Our approach offers a unified solution that is less parameter-sensitive, requires no additional data, and enables simultaneous fitting of the forward model and reconstruction in a single pass, benefiting both linear and nonlinear inverse problems. The experiments show significant quality improvement in removing artifacts and preserving details across three distinct applications, encompassing both linear and nonlinear inverse problems. Moreover, we highlight reconstruction effectiveness in intermediate steps and showcase robustness to random initialization of the residual block and a higher number of iterations during evaluation. Code is available at https://github.com/InvProbs/A-adaptive-model-based-methods.
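
To make the idea of fitting the forward model and the reconstruction together more concrete, the following is a minimal toy sketch and not the authors' architecture. It assumes a linear problem, replaces the learned proximal network with plain soft-thresholding, and replaces the untrained residual block with an additive matrix `D` that is fitted per instance under a Frobenius penalty. All names (`adaptive_unrolled`, `D`, `lam`, `mu`) and all numeric values are hypothetical choices made for illustration.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (stand-in for a learned proximal network)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def objective(y, A0, D, x, lam, mu):
    """Joint objective over the reconstruction x and the model correction D."""
    r = y - (A0 + D) @ x
    return 0.5 * (r @ r) + lam * np.abs(x).sum() + 0.5 * mu * np.sum(D ** 2)

def adaptive_unrolled(y, A0, n_iter=8, lam=0.05, mu=1.0):
    """Alternate a reconstruction step and a forward-model correction step."""
    m, n = A0.shape
    x = np.zeros(n)
    D = np.zeros((m, n))  # assumed additive correction to the nominal model A0
    history = [objective(y, A0, D, x, lam, mu)]
    for _ in range(n_iter):
        # Reconstruction step: one proximal-gradient update with the current model.
        A = A0 + D
        step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1 / (largest singular value)^2
        x = soft_threshold(x + step * (A.T @ (y - A @ x)), step * lam)
        # Model step: one gradient update on D using this instance's data residual.
        r = y - A @ x
        grad_D = -np.outer(r, x) + mu * D
        D = D - grad_D / (x @ x + mu)  # step 1 / (Lipschitz constant in D)
        history.append(objective(y, A0, D, x, lam, mu))
    return x, D, history

# Toy instance: the nominal model A0 is a perturbed copy of the true model.
rng = np.random.default_rng(0)
A_true = rng.standard_normal((30, 20))
A0 = A_true + 0.05 * rng.standard_normal((30, 20))
x_true = np.zeros(20)
x_true[[2, 7, 15]] = [1.0, -0.5, 2.0]
y = A_true @ x_true

x_hat, D_hat, history = adaptive_unrolled(y, A0)
print("objective per iteration:", np.round(history, 4))
```

Both steps use a step size equal to the reciprocal of the relevant Lipschitz constant, so the joint objective in this toy setting does not increase from one iteration to the next. The sketch says nothing about the convergence result or the reconstruction quality reported for the actual method.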

Paper Structure

This paper contains 23 sections, 1 theorem, 15 equations, 9 figures, 3 tables.

Key Result

Proposition 5.1

Assume $r$ is convex, and use the fact that $\|\bm{x} - \bm{z}\|_2^2$ is $L$-smooth with respect to $\bm{x}$; in other words, letting $f(\bm{x}) = \|\bm{x} - \bm{z}\|_2^2$, we have $\|\nabla f(\bm{x}_1) - \nabla f(\bm{x}_2)\|_2 \leq L \|\bm{x}_1 - \bm{x}_2\|_2$ for some $L > 0$. The algorithm in (eq:luhqs_updat
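
As a quick check of the smoothness assumption (a worked step added here, not part of the quoted statement), the gradient of $f$ can be computed directly, which shows that the inequality holds with equality for $L = 2$:

```latex
\nabla f(\bm{x}) = 2(\bm{x} - \bm{z})
\quad\Longrightarrow\quad
\|\nabla f(\bm{x}_1) - \nabla f(\bm{x}_2)\|_2
  = \|2(\bm{x}_1 - \bm{z}) - 2(\bm{x}_2 - \bm{z})\|_2
  = 2\,\|\bm{x}_1 - \bm{x}_2\|_2 .
```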

Figures (9)

  • Figure 1: A proximal LU network is trained for a deblurring task using a single forward model. The top row shows the intermediate reconstructions over 8 iterations using the true model, while the bottom row shows the evaluation results when a small perturbation is added to the forward model (the Peak Signal-to-Noise Ratio of the true kernel to the noisy kernel is 40.9 dB). The quality degradation in the bottom row is due to the accumulation of errors in the forward model.
  • Figure 2: Illustration of the $k$-th iteration of an $\mathcal{A}$-adaptive LU network. $\bm{x}_0$ is fed into the network; the auxiliary update and the correction update correspond to updates in $\theta$ and $\bm{z}$, respectively, and the proximal network (in green) is updated using end-to-end training. The final output contains the parameters for estimating the function mismatch and the reconstruction estimate.
  • Figure 3: Comparing the deblurring results using robust LU, $\mathcal{A}$-adaptive LU and $\mathcal{A}$-adaptive DEQ to the ground truth, where $\bm{x}_0$ is the initial blurry image and $\bm{x}$ is the ground truth. The bottom row shows the zoomed regions in red boxes. The proposed methods generate sharper edges.
  • Figure 4: Comparing the deconvolution results using robust LU, $\mathcal{A}$-adaptive LU and $\mathcal{A}$-adaptive DEQ to the ground truth. The bottom row shows the zoomed regions in red boxes. A seismic layer in the red box is missing from the reconstruction produced by the baseline robust LU method.
  • Figure 5: Comparing the defogging results using robust LU, $\mathcal{A}$-adaptive LU and $\mathcal{A}$-adaptive DEQ to the ground truth. The bottom row shows the zoomed regions in red boxes. The proposed methods reproduce cleaner images.
  • ...and 4 more figures
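
The Figure 1 caption quantifies the kernel perturbation by its Peak Signal-to-Noise Ratio. The snippet below is a minimal sketch of one common PSNR definition, $10 \log_{10}(\text{peak}^2 / \text{MSE})$. The choice of peak (here, the largest absolute value of the reference) is an assumption, since the caption does not state which convention produced the 40.9 dB figure, and the kernel values are made up for illustration.

```python
import numpy as np

def psnr(reference, estimate, peak=None):
    """PSNR in dB of `estimate` relative to `reference`.

    `peak` defaults to max|reference|, which is an assumed convention.
    """
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")  # identical signals
    if peak is None:
        peak = np.max(np.abs(reference))
    return 10.0 * np.log10(peak ** 2 / mse)

# Illustrative 1-D blur kernel and a fixed perturbation pattern (hypothetical values).
kernel = np.array([0.05, 0.25, 0.40, 0.25, 0.05])
perturbation = np.array([1.0, -1.0, 0.5, -0.5, 0.0])

small = psnr(kernel, kernel + 1e-3 * perturbation)
large = psnr(kernel, kernel + 1e-2 * perturbation)
print(f"PSNR, small perturbation: {small:.1f} dB")
print(f"PSNR, 10x larger perturbation: {large:.1f} dB")
```

Scaling the perturbation by a factor of 10 lowers the PSNR by 20 dB under this definition, which gives a sense of how small a perturbation a value near 40 dB represents.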

Theorems & Definitions (2)

  • Proposition 5.1
  • proof