Memory-efficient deep end-to-end posterior network (DEEPEN) for inverse problems

Jyothi Rikhab Chand; Mathews Jacob

TL;DR

DEEPEN addresses the memory burden of end-to-end MR reconstruction by learning a posterior distribution $p_{\boldsymbol\theta}(\boldsymbol{x}|\boldsymbol{b})$ that combines a data-consistency term with a CNN-based energy prior $E_{\boldsymbol\theta}(\boldsymbol{x})$. Trained via maximum likelihood, using real samples and Langevin-generated fake samples, the framework yields MAP reconstructions and allows posterior sampling to produce uncertainty maps, all without heavy backpropagation through unrolled iterations. Unlike PnP or DEQ approaches that impose Lipschitz constraints, DEEPEN guarantees convergence to a stationary point and supports efficient sampling. Empirical results on four-fold undersampled parallel MR data show DEEPEN achieving performance on par with memory-intensive methods and providing explicit uncertainty quantification, thus enabling scalable MR reconstruction in higher dimensions such as 3D.
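The maximum-likelihood training summarized above can be illustrated on a scalar toy problem. The sketch below is not the authors' implementation. It assumes a one-parameter quadratic energy $E_\theta(x) = \theta x^2/2$ in place of the CNN, an identity forward operator with noise variance `sigma2`, and a standard unadjusted Langevin update. All function names and hyperparameters are hypothetical. The parameter update uses the difference between the energy gradients of real and fake samples, and the fake samples are treated as constants, so nothing is backpropagated through the Langevin iterations.

```python
import random


def langevin_fake_sample(b, theta, sigma2, rng, steps=30, eps=0.3):
    """Draw one fake sample x- from p_theta(x|b), proportional to exp(-L_theta(x)).

    Toy cost (an assumption): L_theta(x) = (x - b)^2 / (2*sigma2) + theta * x^2 / 2.
    Only the current iterate is kept; no intermediate state is stored.
    """
    x = b  # initialise the chain at the measurement
    for _ in range(steps):
        grad = (x - b) / sigma2 + theta * x  # dL/dx
        x = x - 0.5 * eps ** 2 * grad + eps * rng.gauss(0.0, 1.0)
    return x


def train_energy(theta_true=2.0, sigma2=0.5, iters=300, batch=64, lr=0.5, seed=0):
    """Fit theta by stochastic maximum likelihood on synthetic (x+, b) pairs."""
    rng = random.Random(seed)
    theta = 0.5  # deliberately poor initial guess
    history = []
    for _ in range(iters):
        grad_theta = 0.0
        for _ in range(batch):
            # Real sample x+ from the toy prior N(0, 1/theta_true) and its measurement b.
            x_pos = rng.gauss(0.0, theta_true ** -0.5)
            b = x_pos + rng.gauss(0.0, sigma2 ** 0.5)
            # Fake sample x- from the current model posterior.
            x_neg = langevin_fake_sample(b, theta, sigma2, rng)
            # dE/dtheta = x^2 / 2, so the likelihood gradient is the
            # energy-gradient difference between real and fake samples.
            grad_theta += 0.5 * x_pos ** 2 - 0.5 * x_neg ** 2
        theta -= lr * grad_theta / batch
        history.append(theta)
    return history


history = train_energy()
theta_hat = sum(history[-100:]) / 100  # average of the last iterates
print(f"learned theta ~ {theta_hat:.2f} (data generated with theta_true = 2.0)")
```

In this toy setting the learned `theta` should settle close to `theta_true`, up to sampling noise and the discretization bias of the Langevin step.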

Abstract

End-to-End (E2E) unrolled optimization frameworks show promise for Magnetic Resonance (MR) image recovery, but suffer from high memory usage during training. In addition, these deterministic approaches do not offer a way to sample from the posterior distribution. In this paper, we introduce a memory-efficient approach for E2E learning of the posterior distribution. We represent this distribution as the combination of a data-consistency-induced likelihood term and an energy model for the prior, parameterized by a Convolutional Neural Network (CNN). The CNN weights are learned from training data in an E2E fashion using maximum likelihood optimization. The learned model enables the recovery of images from undersampled measurements using Maximum A Posteriori (MAP) optimization. In addition, the posterior model can be sampled to derive uncertainty maps for the reconstruction. Experiments on parallel MR image reconstruction show that our approach performs comparably to the memory-intensive E2E unrolled algorithm, performs better than its memory-efficient counterpart, and can provide uncertainty maps. Our framework paves the way towards MR image reconstruction in 3D and higher dimensions.
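To make the link between posterior sampling and uncertainty maps concrete, the following sketch draws Langevin samples from a scalar toy posterior and reports their mean (an MMSE-style estimate) and variance (the uncertainty). It is an illustration under stated assumptions, not the paper's code: the energy is the fixed quadratic $x^2/2$ rather than a learned CNN, the forward operator is a scalar `a`, and the function names, step size, and sample count are hypothetical. The Figure 4 caption reports 100 samples; 500 are used here only to stabilize the toy estimate.

```python
import random


def posterior_grad(x, b, a, sigma2):
    # Gradient of the toy cost L(x) = (a*x - b)^2 / (2*sigma2) + x^2 / 2.
    return a * (a * x - b) / sigma2 + x


def langevin_sample(b, a, sigma2, rng, steps=100, eps=0.3):
    # One independent chain of unadjusted Langevin dynamics, returning its last state.
    x = 0.0
    for _ in range(steps):
        x = x - 0.5 * eps ** 2 * posterior_grad(x, b, a, sigma2) + eps * rng.gauss(0.0, 1.0)
    return x


def mmse_and_uncertainty(b, a=1.0, sigma2=0.5, n_samples=500, seed=0):
    rng = random.Random(seed)
    samples = [langevin_sample(b, a, sigma2, rng) for _ in range(n_samples)]
    mean = sum(samples) / n_samples                                # MMSE-style estimate
    var = sum((s - mean) ** 2 for s in samples) / (n_samples - 1)  # uncertainty
    return mean, var


mean, var = mmse_and_uncertainty(b=1.0)

# For this Gaussian toy the exact posterior is known in closed form:
# precision = a^2/sigma2 + 1 = 3, mean = (a*b/sigma2)/3 = 2/3, variance = 1/3.
print(f"sample mean {mean:.3f} (exact 0.667), sample variance {var:.3f} (exact 0.333)")
```

For an image, the same mean and variance would be computed per pixel across the sampled reconstructions.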

Paper Structure (14 sections, 1 theorem, 12 equations, 4 figures, 1 table)

Introduction
Posterior learning
Maximum Likelihood training of the posterior
Generation of fake samples using Markov Chain Monte Carlo (MCMC)
Maximum a posteriori image recovery
Experiments
Data set
Architecture and implementation
Results
Maximum a posteriori estimates
Bayes estimation
Conclusion
Compliance with ethical standards
Acknowledgments

Key Result

Theorem 2.1

Consider the cost function $\mathcal{L}_{\theta}({\boldsymbol{x}})$ in (eq:posterior_modeled), which is bounded below by zero (the CNN implementation $E_\theta({\boldsymbol{x}})$ has a Rectified Linear Unit (ReLU) in the output layer, which makes the lower bound zero). Then, the steepest descent optim…
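The theorem statement is cut off here, so the following is only an illustration of a steepest-descent iteration on a cost that is bounded below by zero, not a restatement of the result. It assumes a scalar toy cost in which the smooth nonnegative term $\log(1+x^2)$ stands in for the CNN energy, and it uses a backtracking (Armijo) line search to pick the step size. These choices and all names are assumptions.

```python
import math


def cost(x, b=2.0, sigma2=1.0):
    # Toy cost: data-consistency term plus a nonnegative stand-in energy.
    return (x - b) ** 2 / (2.0 * sigma2) + math.log1p(x * x)


def cost_grad(x, b=2.0, sigma2=1.0):
    return (x - b) / sigma2 + 2.0 * x / (1.0 + x * x)


def steepest_descent(x0, tol=1e-6, max_iter=200):
    x = x0
    costs = [cost(x)]
    for _ in range(max_iter):
        g = cost_grad(x)
        if abs(g) < tol:
            break  # gradient is (numerically) zero: stationary point reached
        step = 1.0
        for _ in range(30):
            x_new = x - step * g
            # Armijo condition: accept only a sufficient decrease of the cost.
            if cost(x_new) <= costs[-1] - 1e-4 * step * g * g:
                break
            step *= 0.5
        else:
            break  # no acceptable step found; stop without updating
        x = x_new
        costs.append(cost(x))
    return x, costs


x_map, costs = steepest_descent(x0=-3.0)
print(f"stationary point x = {x_map:.6f} after {len(costs) - 1} accepted steps")
```

Because every accepted step must satisfy the sufficient-decrease test, the recorded costs never increase, and the iteration stops once the gradient is numerically zero.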

Figures (4)

Figure 1: Illustration of the computation of $E_{\boldsymbol \theta}({\boldsymbol{x}})$ using a two-layer network. Its gradient $\nabla_{\boldsymbol{x}} E_{\boldsymbol \theta}({\boldsymbol{x}})$ is computed using the chain rule. $W_{1}$ and $W_{1}^T$ represent a convolutional and a transposed convolutional layer of appropriate size with shared weights, respectively; $L_{2}$ and $L_{2}^T$ are linear layers with shared weights; $\tau$ is the activation function and $\tau^{'}$ represents its gradient.
Figure 2: Illustration of the training procedure of the proposed algorithm. The fake samples ${\boldsymbol{x}}^-$ are generated using Langevin sampling, indicated by the yellow box. Because the intermediate results are not stored for backpropagation, a physical layer is sufficient for forward propagation, which keeps the memory demand low. The samples ${\boldsymbol{x}}^+$ are obtained from the training data. The training loss involves the energy difference between true and fake samples. Therefore, the training algorithm aims to modify the energy $E_{\boldsymbol \theta}(\cdot)$ so that the generated samples ${\boldsymbol{x}}^-$ match the true samples ${\boldsymbol{x}}^+$.
Figure 3: Comparison of DEEPEN with MoDL and MoL for two different contrasts: (a) T2 and (b) FLAIR. The top row shows the reconstructed image, the second row shows the enlarged image, and the third row shows the error image.
Figure 4: MMSE estimate, uncertainty map, and MAP estimate given by the DEEPEN algorithm on the four-fold undersampled FLAIR image. The MMSE estimate and the uncertainty map were obtained by taking the mean and variance over $100$ samples.
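The chain-rule computation described in the Figure 1 caption can be sketched for a small dense network. This is a simplified illustration under assumptions, not the paper's architecture: a dense matrix `W1` replaces the convolutional layer, softplus is chosen as the activation $\tau$, the output ReLU is omitted, and the weights and function names are hypothetical. With $E(x) = L_2\,\tau(W_1 x)$, the gradient reuses the same weights in transposed form: $\nabla_x E = W_1^T\,(\tau'(W_1 x) \odot L_2^T)$.

```python
import math

# Hypothetical shared weights: W1 maps 4 inputs to 3 features, L2 maps 3 features to a scalar.
W1 = [[0.5, -0.2, 0.1, 0.3],
      [-0.4, 0.6, 0.2, -0.1],
      [0.3, 0.1, -0.5, 0.2]]
L2 = [0.7, -0.3, 0.5]


def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def mat_t_vec(W, v):
    # Multiply by the transpose of W, reusing the same weights.
    return [sum(W[i][j] * v[i] for i in range(len(W))) for j in range(len(W[0]))]


def tau(z):
    return math.log1p(math.exp(z))  # softplus


def tau_prime(z):
    return 1.0 / (1.0 + math.exp(-z))  # derivative of softplus (sigmoid)


def energy(x):
    h = matvec(W1, x)
    return sum(l * tau(z) for l, z in zip(L2, h))


def energy_grad(x):
    h = matvec(W1, x)
    v = [l * tau_prime(z) for l, z in zip(L2, h)]  # L2^T weighted by tau'(W1 x)
    return mat_t_vec(W1, v)                        # W1^T applied to the result


def finite_difference_grad(x, h=1e-6):
    # Central differences, used only to check the analytic gradient.
    grad = []
    for i in range(len(x)):
        xp = list(x); xp[i] += h
        xm = list(x); xm[i] -= h
        grad.append((energy(xp) - energy(xm)) / (2.0 * h))
    return grad


x = [0.2, -1.0, 0.5, 1.5]
max_err = max(abs(a - n) for a, n in zip(energy_grad(x), finite_difference_grad(x)))
print(f"max difference between analytic and numerical gradient: {max_err:.2e}")
```

The agreement with the finite-difference check indicates that the transposed, weight-sharing pass reproduces the gradient of the forward energy computation.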

Theorems & Definitions (1)

Theorem 2.1: Wright
