L2SR: Learning to Sample and Reconstruct for Accelerated MRI via Reinforcement Learning

Pu Yang, Bin Dong

TL;DR

The paper proposes an alternating training framework that jointly learns a good sampler-reconstructor pair via deep reinforcement learning, and introduces a novel sparse-reward partially observed Markov decision process (POMDP) to formulate the MRI sampling trajectory.

Abstract

Magnetic Resonance Imaging (MRI) is a widely used medical imaging technique, but its long acquisition time can be a limiting factor in clinical settings. To address this issue, researchers have been exploring ways to reduce the acquisition time while maintaining the reconstruction quality. Previous works have focused on finding either sparse samplers with a fixed reconstructor or reconstructors with a fixed sampler. However, these approaches do not fully exploit the potential of learning samplers and reconstructors jointly. In this paper, we propose an alternating training framework for jointly learning a good pair of samplers and reconstructors via deep reinforcement learning (RL). In particular, we treat MRI sampling as a sampling trajectory controlled by a sampler, and introduce a novel sparse-reward Partially Observed Markov Decision Process (POMDP) to formulate this trajectory. Compared to the dense-reward POMDP used in existing works, the proposed sparse-reward POMDP is more computationally efficient and has a provable advantage. Moreover, the proposed framework, called L2SR (Learning to Sample and Reconstruct), overcomes the training mismatch problem that arises in previous methods that use the dense-reward POMDP. By alternately updating samplers and reconstructors, L2SR learns a pair of samplers and reconstructors that achieves state-of-the-art reconstruction performance on the fastMRI dataset. Code is available at https://github.com/yangpuPKU/L2SR-Learning-to-Sample-and-Reconstruct.
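The computational difference between the two reward schemes can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a zero-filled inverse FFT as a stand-in for the learned reconstructor, negative MSE as a stand-in for the quality metric, a dense reward defined as the per-step improvement in that metric, and a sparse reward equal to the final metric. All function names are hypothetical.

```python
import numpy as np

def zero_filled_recon(kspace, mask):
    """Stand-in reconstructor (assumption): inverse FFT of the masked k-space."""
    return np.abs(np.fft.ifft2(kspace * mask[None, :]))

def quality(recon, target):
    """Stand-in metric (assumption): negative MSE, so higher is better."""
    return -float(np.mean((recon - target) ** 2))

def rollout(target, init_mask, actions, reward_type):
    """Acquire one k-space column per action; return (rewards, reconstructor calls)."""
    kspace = np.fft.fft2(target)
    mask = init_mask.astype(float).copy()
    rewards, calls = [], 0
    if reward_type == "dense":
        # The dense scheme needs a baseline score before the first action.
        prev = quality(zero_filled_recon(kspace, mask), target)
        calls += 1
    for t, a in enumerate(actions):
        mask[a] = 1.0
        if reward_type == "dense":
            # Reconstruct after every acquisition; reward is the improvement.
            cur = quality(zero_filled_recon(kspace, mask), target)
            calls += 1
            rewards.append(cur - prev)
            prev = cur
        elif t == len(actions) - 1:
            # Sparse scheme: reconstruct once, at the end of the trajectory.
            rewards.append(quality(zero_filled_recon(kspace, mask), target))
            calls += 1
        else:
            rewards.append(0.0)
    return rewards, calls

rng = np.random.default_rng(0)
target = rng.random((16, 16))
init_mask = np.zeros(16)
init_mask[0] = 1.0                      # start from the DC column (unshifted FFT)
actions = [1, 15, 2, 14]

dense_rewards, dense_calls = rollout(target, init_mask, actions, "dense")
sparse_rewards, sparse_calls = rollout(target, init_mask, actions, "sparse")
print(dense_calls, sparse_calls)
```

Under these assumptions the dense rollout calls the reconstructor once per step plus once for the baseline, while the sparse rollout calls it once per trajectory. The dense rewards telescope, so their sum equals the final score minus the initial score.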

Paper Structure

This paper contains 41 sections, 4 theorems, 59 equations, 8 figures, 6 tables, and 2 algorithms.

Key Result

Theorem 1

The optimal values of optimization problems equ:dense-optim-constrain and equ:sparse-optim-constrain satisfy

Figures (8)

  • Figure 1: Diagram of the dense-reward and sparse-reward POMDP.
  • Figure 2: Overview of (a) the training framework of dynamic sampling with fixed reconstructors, and (b) the alternating training framework.
  • Figure 3: Transversal view of the policy model comprising a 2D Inverse Fourier Transform, a feature extractor, and actor and critic neural networks. The learnable parts are enclosed in boxes. The feature extractor is a CNN-based neural network following the architecture from NEURIPS2020_daed2103. The actor and critic nets are both constructed as fully connected neural networks.
  • Figure 4: Histograms of SSIM values as shown in table:sparse-ssim. Each figure contains histograms of six methods: Random, PG-MRI, L2S, LOUPE, $\tau$-Step Seq, L2SR.
  • Figure 5: The choice of the number of alternations. We show SSIM values of L2SR with respect to different rounds of alternating training on the knee test dataset. Specifically, "$L=l.5$" means the $(l+1)$th round of training the sampler with the learned reconstructor, i.e., solving equ:alter-suboptim2.
  • ...and 3 more figures
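
The data flow of the policy model in Figure 3 (2D inverse Fourier transform, feature extractor, actor and critic heads) can be sketched as follows. This is a toy illustration, not the paper's architecture. A column-mean projection with random weights stands in for the CNN feature extractor, single linear layers stand in for the fully connected actor and critic, and masking out already-sampled columns in the actor output is an assumption. The names `ToyPolicy` and `masked_softmax` are hypothetical.

```python
import numpy as np

def masked_softmax(logits, sampled):
    """Softmax over unsampled columns only; requires at least one unsampled column."""
    logits = np.where(sampled, -np.inf, logits)
    logits = logits - logits[~sampled].max()   # subtract the max for numerical stability
    exp = np.exp(logits)                       # exp(-inf) = 0 for sampled columns
    return exp / exp.sum()

class ToyPolicy:
    """Toy actor-critic: inverse FFT -> features -> (action probabilities, value)."""

    def __init__(self, width, hidden=32, seed=0):
        rng = np.random.default_rng(seed)
        # Random weights stand in for trained parameters (assumption).
        self.w_feat = rng.normal(0.0, 0.1, (width, hidden))
        self.w_actor = rng.normal(0.0, 0.1, (hidden, width))
        self.w_critic = rng.normal(0.0, 0.1, hidden)

    def forward(self, masked_kspace, sampled):
        # 2D inverse Fourier transform of the observed (masked) k-space.
        image = np.abs(np.fft.ifft2(masked_kspace))
        # Stand-in feature extractor: column means followed by a linear map.
        feat = np.tanh(image.mean(axis=0) @ self.w_feat)
        probs = masked_softmax(feat @ self.w_actor, sampled)   # actor head
        value = float(feat @ self.w_critic)                    # critic head
        return probs, value

rng = np.random.default_rng(1)
image = rng.random((8, 12))
sampled = np.zeros(12, dtype=bool)
sampled[[0, 1, 11]] = True                     # columns acquired so far
masked_kspace = np.fft.fft2(image) * sampled[None, :]

policy = ToyPolicy(width=12)
probs, value = policy.forward(masked_kspace, sampled)
next_column = int(np.argmax(probs))            # greedy choice of the next column
```

The actor returns a distribution over k-space columns that assigns zero probability to columns already acquired, and the critic returns a scalar value estimate for the current observation.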

Theorems & Definitions (11)

  • Theorem 1: joint optimization problem
  • Proof
  • Theorem 2: distributional mismatch
  • Proof
  • Proposition 1
  • Proof
  • Proof
  • Proof
  • Proof
  • Proposition 2
  • ...and 1 more