End-to-End 4D Heart Mesh Recovery Across Full-Stack and Sparse Cardiac MRI

Yihong Chen, Jiancheng Yang, Deniz Sayin Mercadier, Hieu Le, Juerg Schwitter, Pascal Fua

TL;DR

The paper tackles reconstructing 4D cardiac motion from both full-stack CMR data and sparse intra-procedural slices. It introduces TetHeart, an end-to-end framework built on deformable tetrahedra in a shared space, featuring an Attentive 2D-3D Feature Assembler (AFA), a full-to-sparse distillation strategy, and a two-stage weakly supervised motion learning scheme that uses only keyframes such as end-diastole (ED) and end-systole (ES). The method achieves state-of-the-art accuracy on public datasets and demonstrates strong generalization to interventional and private datasets without retraining, with real-time inference on limited slices (e.g., 12 FPS). This work enables robust online 3D heart tracking during interventions and lays the groundwork for patient-specific digital twins and adaptive image-guided procedures in a clinical setting.

Abstract

Reconstructing cardiac motion from CMR sequences is critical for diagnosis, prognosis, and intervention. Existing methods rely on complete CMR stacks to infer full heart motion, limiting their applicability during intervention when only sparse observations are available. We present TetHeart, the first end-to-end framework for unified 4D heart mesh recovery from both offline full-stack and intra-procedural sparse-slice observations. Our method leverages deformable tetrahedra to capture shape and motion in a coherent space shared across cardiac structures. Before a procedure, it initializes detailed, patient-specific heart meshes from high-quality full stacks. During the procedure, these meshes can be updated in real time using whatever slices can be obtained, down to a single one. TetHeart incorporates several key innovations: (i) an attentive slice-adaptive 2D-3D feature assembly mechanism that integrates information from arbitrary numbers of slices at any position; (ii) a distillation strategy to ensure accurate reconstruction under extreme sparsity; and (iii) a weakly supervised motion learning scheme requiring annotations only at keyframes, such as the end-diastolic and end-systolic phases. Trained and validated on three large public datasets and evaluated zero-shot on additional private interventional and public datasets without retraining, TetHeart achieves state-of-the-art accuracy and strong generalization in both pre- and intra-procedural settings.

Paper Structure

This paper contains 28 sections, 12 equations, 8 figures, 8 tables.

Figures (8)

  • Figure 1: Difference between the standard offline scenario and the intervention room setting for cardiac motion reconstruction.
  • Figure 2: TetHeart reconstructs cardiac motion from observed 2D slices. At $t=0$, a 3D CMR stack $\mathbf{I}^0$ is processed by the AFA module with self-attention to extract a volumetric static feature $V_{\text{static}}^0$, which is then used to generate the initial tetrahedra $\mathcal{G}^0$. For each subsequent time step $t$, the observed 2D slices $\mathbf{O}^t$ are encoded via the AFA module with cross-attention to obtain a volumetric motion feature $V_{\text{motion}}^t$, which is then used to recover motion by predicting offsets that deform $\mathcal{G}^0$. Weights are shared between the static and dynamic branches.
  • Figure 3: 1-Slice Comparison. (Left) Meshes are predicted by deforming from the ED frame to the ES frame. Color indicates the magnitude of the point-to-surface error. (Right) Segmentation results on slices at the apex and base locations. Note that 4DMR can only output motion for the myocardium.
  • Figure 4: Motion sequence prediction results using 1 slice and the full stack, obtained by deforming from the ED frame to the target frame. The corresponding volume-frame curve is shown on the right. #1: NOR, #2: DCM.
  • Figure 5: Chamfer distance as a function of the number of slices used to capture the motion.
  • ...and 3 more figures
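
For readers unfamiliar with the metric named in Figure 5, the following is a minimal sketch of a symmetric Chamfer distance between two point sets, such as vertices sampled from a predicted and a reference mesh. It assumes the unsquared, mean-based variant; the paper may use a different convention (squared distances, sums, or other scaling). The function name `chamfer_distance` and the toy point sets are hypothetical and are not taken from the paper.

```python
import numpy as np


def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3).

    Assumed variant: mean nearest-neighbour Euclidean distance from a to b,
    plus the same quantity from b to a. Other conventions exist.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Pairwise Euclidean distances, shape (N, M).
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    # Nearest neighbour in b for each point of a, and vice versa.
    return d.min(axis=1).mean() + d.min(axis=0).mean()


# Toy example (hypothetical data): a unit square and a copy shifted by 0.5 along z.
square = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 0]], dtype=float)
shifted = square + np.array([0.0, 0.0, 0.5])

print(chamfer_distance(square, square))
print(chamfer_distance(square, shifted))
```

The brute-force pairwise matrix is adequate for a few thousand points; for dense meshes a spatial index such as a k-d tree would be the usual substitute.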