Retrospective motion correction in MRI using disentangled embeddings

Qi Wang, Veronika Ecker, Marcel Früh, Sergios Gatidis, Thomas Küstner

TL;DR

This work tackles the problem of motion artifacts in MRI and the limited generalization of existing retrospective corrections across motion types. It introduces a two-stage framework built on a hierarchical, conditional VQVAE with multi-resolution codebooks to learn disentangled motion embeddings, complemented by a PixelSNAIL autoregressive prior conditioned on motion severity to guide correction. The approach enables correction without artifact-specific training and shows qualitative robustness across simulated whole-body motion, highlighting the value of disentangled latent representations for motion translation tasks. If extended to multiple modalities and richer priors, this framework could broadly improve MRI diagnostic quality across various body regions and motion patterns.

Abstract

Physiological motion can affect the diagnostic quality of magnetic resonance imaging (MRI). While various retrospective motion correction methods exist, many struggle to generalize across different motion types and body regions. In particular, machine learning (ML)-based corrections are often tailored to specific applications and datasets. We hypothesize that motion artifacts, though diverse, share underlying patterns that can be disentangled and exploited. To address this, we propose a hierarchical vector-quantized (VQ) variational auto-encoder that learns a disentangled embedding of motion-to-clean image features. A codebook is deployed to capture a finite collection of motion patterns at multiple resolutions, enabling coarse-to-fine correction. An auto-regressive model is trained to learn the prior distribution of motion-free images and is used at inference to guide the correction process. Unlike conventional approaches, our method does not require artifact-specific training and can generalize to unseen motion patterns. We demonstrate the approach on simulated whole-body motion artifacts and observe robust correction across varying motion severity. Our results suggest that the model effectively disentangles the physical motion in the simulated motion-affected scans, thereby improving the generalizability of ML-based MRI motion correction. Disentangling motion features in this way sheds light on potential applications across anatomical regions and motion types.
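
The codebook lookup at the core of a vector-quantized auto-encoder can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `quantize`, the codebook sizes, the latent grid sizes, and the use of random arrays in place of trained encoder outputs are all assumptions made for illustration. Each feature vector is replaced by its nearest codebook entry, once at a coarse resolution and once at a finer one.

```python
import numpy as np

def quantize(features, codebook):
    """Replace each feature vector by its nearest codebook entry (L2 distance).

    features: (N, D) array standing in for encoder outputs
    codebook: (K, D) array standing in for learned code vectors
    Returns (indices, quantized) with shapes (N,) and (N, D).
    """
    # Squared distance between every feature and every code: shape (N, K)
    d2 = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    indices = d2.argmin(axis=1)
    return indices, codebook[indices]

rng = np.random.default_rng(0)

# Hypothetical sizes: 8 coarse codes and 16 fine codes, each of dimension 4.
codebook_coarse = rng.normal(size=(8, 4))
codebook_fine = rng.normal(size=(16, 4))

# Random stand-ins for encoder features on a 4x4 and an 8x8 latent grid.
coarse_feats = rng.normal(size=(4 * 4, 4))
fine_feats = rng.normal(size=(8 * 8, 4))

idx_c, e1 = quantize(coarse_feats, codebook_coarse)  # coarse discrete features
idx_f, e2 = quantize(fine_feats, codebook_fine)      # fine discrete features
```

Because every latent position is reduced to an integer index, an auto-regressive prior can model the index maps directly, which is what makes a discrete codebook convenient for the second stage described above.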

Paper Structure

This paper contains 9 sections, 3 equations, 3 figures.

Figures (3)

  • Figure 1: Diagram of our conditional auto-encoder during codebook training. The model takes as input a motion-affected image $\hat{x}$ and performs multi-resolution encoding to extract discrete features $e_{1}$ and $e_{2}$. The class conditioning $y$ is passed through a linear layer $h(\cdot)$ and used to condition both the feature map $D_{1}(e_{1})$ and the codebook $e_{2}$ at the higher resolution, from which the decoding architecture generates a motion-corrected clean target image $x$. The conditioning operation here refers to feature addition.
  • Figure 2: Qualitative results of the proposed model for motion correction at various motion severity levels $y$. From the sagittal whole-body imaging plane (left panel), the abdominal region was selected (yellow dashed box), serving as the motion-free reference. The right panel shows zoomed-in abdominal views with motion severity increasing from top to bottom, with motion-corrected images on the left and motion-affected images on the right. The model successfully removes motion artifacts while preserving anatomical structures across different motion severity levels.
  • Figure 3: Qualitative comparison of the motion-corrected results before and after the codebooks are rearranged by conditioning on the prior. The top row (yellow box) shows the motion-corrected results without rearranged codebooks, while the bottom row (green box) shows the motion-corrected results decoded from the rearranged codebooks. Motion severity increases from left to right: $y=2$, $y=5$, and $y=8$. Note that the low-frequency motion patterns are further removed in the bottom row, yielding cleaner image content.
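
The additive conditioning described in the Figure 1 caption can be sketched as follows. This is an illustration under stated assumptions rather than the paper's code: the one-hot encoding of $y$, the number of severity levels, the channel count, the sharing of one linear layer between both targets, and the broadcast of the conditioning vector over all spatial positions are hypothetical choices, and random arrays stand in for $D_{1}(e_{1})$ and $e_{2}$.

```python
import numpy as np

NUM_LEVELS = 10  # hypothetical number of motion severity levels
CHANNELS = 4     # hypothetical number of feature channels

def one_hot(y, num_levels=NUM_LEVELS):
    """Encode an integer severity level as a one-hot vector (an assumption)."""
    v = np.zeros(num_levels)
    v[y] = 1.0
    return v

def h(y_vec, weight, bias):
    """Linear layer h(.): maps the label vector to one offset per channel."""
    return weight @ y_vec + bias

def add_condition(features, offset):
    """Add the per-channel offset at every spatial position.

    features: (C, H, W), offset: (C,)
    """
    return features + offset[:, None, None]

rng = np.random.default_rng(0)
weight = rng.normal(size=(CHANNELS, NUM_LEVELS))
bias = np.zeros(CHANNELS)

d1_e1 = rng.normal(size=(CHANNELS, 8, 8))  # stand-in for D_1(e_1)
e2 = rng.normal(size=(CHANNELS, 8, 8))     # stand-in for e_2

offset = h(one_hot(5), weight, bias)       # severity level y = 5
d1_cond = add_condition(d1_e1, offset)
e2_cond = add_condition(e2, offset)
```

With additive conditioning the label only shifts the features, so the same decoder weights serve every severity level and the label can be changed at inference without retraining.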