Spatio-temporal neural distance fields for conditional generative modeling of the heart

Kristine Sørensen, Paula Diez, Jan Margeta, Yasmin El Youssef, Michael Pham, Jonas Jalili Pedersen, Tobias Kühl, Ole de Backer, Klaus Kofoed, Oscar Camara, Rasmus Paulsen

TL;DR

This work models dynamic cardiac anatomy in relation to clinical factors by introducing a conditional generative model based on spatio-temporal neural distance fields. An auto-decoder architecture learns two latent spaces, $\mathbf{z}_c=g_\phi(\mathbf{c})$ for clinical demography and $\mathbf{z}_r$ as an individual residual, so that a single network can represent multiple LA/LAA geometries and motions via $\hat{d}=f_\theta(\mathbf{z}_c \oplus \mathbf{z}_r \oplus \mathbf{x})$ with $\mathbf{x}=(\mathbf{p},t)$. Training optimizes a clamped L1 loss over a large set of space-time samples, and surfaces are extracted from a $128^3$ grid; the approach is demonstrated on 4D CFA data from 667 participants. The method outperforms state-of-the-art methods in sequence completion and generates realistic, demography-conditioned anatomical sequences. It can also infer functional metrics (e.g., $V_{\max}$, $FC$, $CC$) from static images and synthesize cohorts with specified demographic factors. Overall, the framework offers a versatile way to model and explore the spatio-temporal dynamics of moving cardiac anatomies for applications in planning, disease analysis, and synthetic population studies.
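To make the notation concrete, the following is a minimal NumPy sketch of how the decoder input $\mathbf{z}_c \oplus \mathbf{z}_r \oplus \mathbf{x}$ could be assembled for a batch of space-time samples and how a clamped L1 loss could be evaluated. It is not the authors' implementation: the latent sizes, the clamp value `delta=0.1`, the DeepSDF-style form of the clamp, and the linear `toy_decoder` standing in for $f_\theta$ are all assumptions made for illustration.

```python
import numpy as np

def decoder_input(z_c, z_r, p, t):
    """Build z_c ⊕ z_r ⊕ x with x = (p, t) for K space-time samples."""
    k = p.shape[0]
    x = np.concatenate([p, t.reshape(k, 1)], axis=1)        # (K, 4) coordinates
    z = np.concatenate([z_c, z_r])                           # shared by all K samples
    return np.concatenate([np.tile(z, (k, 1)), x], axis=1)   # (K, dim_c + dim_r + 4)

def clamped_l1(d_pred, d_true, delta=0.1):
    """Clamped L1 (assumed form): clip both distances to [-delta, delta], then mean |diff|."""
    diff = np.clip(d_pred, -delta, delta) - np.clip(d_true, -delta, delta)
    return float(np.mean(np.abs(diff)))

def toy_decoder(inputs, weights):
    """Hypothetical stand-in for f_theta: a single linear layer."""
    return inputs @ weights

# Toy usage with made-up sizes (8-dim z_c, 16-dim z_r, 5 samples).
rng = np.random.default_rng(0)
z_c, z_r = rng.normal(size=8), rng.normal(size=16)
p, t = rng.uniform(-1, 1, size=(5, 3)), rng.uniform(0, 1, size=5)
inputs = decoder_input(z_c, z_r, p, t)
d_hat = toy_decoder(inputs, rng.normal(size=inputs.shape[1]))
loss = clamped_l1(d_hat, np.zeros(5))
```

Clamping concentrates the loss on samples near the surface, since errors far from the zero level set saturate at `delta` and contribute nothing once both values are clipped to the same bound.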

Abstract

The rhythmic pumping motion of the heart is a cornerstone of life, as it circulates blood to the entire human body through a series of carefully timed contractions of the individual chambers. Changes in the size, shape and movement of the chambers can be important markers for cardiac disease, and modeling these changes in relation to clinical demography or disease is therefore of interest. Existing methods for spatio-temporal modeling of the human heart require shape correspondence over time or suffer from large memory requirements, making them difficult to use for complex anatomies. We introduce a novel conditional generative model, where the shape and movement are modeled implicitly in the form of a spatio-temporal neural distance field and conditioned on clinical demography. The model is based on an auto-decoder architecture and aims to disentangle the individual variations from those related to the clinical demography. It is tested on the left atrium (including the left atrial appendage), where it outperforms current state-of-the-art methods for anatomical sequence completion and generates synthetic sequences that realistically mimic the shape and motion of the real left atrium. In practice, this means we can infer functional measurements from a static image, generate synthetic populations with specified demography or disease, and investigate how non-imaging clinical data affect the shape and motion of cardiac anatomies.
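As an illustration of how functional measurements could be read off a predicted sequence, the sketch below estimates chamber volume from a signed distance grid by counting nodes with negative distance, then derives a fractional change from the resulting volume curve. Several things here are assumptions rather than details from the paper: the sign convention (negative inside), the definition $FC=(V_{\max}-V_{\min})/V_{\max}$, the $64^3$ grid (smaller than the $128^3$ grid mentioned above, to keep the example light), and the pulsating sphere used in place of a real left atrium.

```python
import numpy as np

def volume_from_sdf(sdf_grid, voxel_size):
    """Approximate enclosed volume: number of grid nodes inside (sdf < 0) times voxel volume."""
    return np.count_nonzero(sdf_grid < 0) * voxel_size ** 3

def fractional_change(volumes):
    """Assumed definition: (V_max - V_min) / V_max over the cycle."""
    v = np.asarray(volumes, dtype=float)
    return (v.max() - v.min()) / v.max()

# Toy sequence: a sphere whose radius varies over four time frames.
n = 64
axis = np.linspace(-1.0, 1.0, n)
X, Y, Z = np.meshgrid(axis, axis, axis, indexing="ij")
voxel = axis[1] - axis[0]
radii = [0.5, 0.45, 0.4, 0.45]
volumes = [volume_from_sdf(np.sqrt(X**2 + Y**2 + Z**2) - r, voxel) for r in radii]
fc = fractional_change(volumes)
```

Node counting is the crudest possible estimator. Computing the volume of an extracted surface mesh would be more accurate, but the counting version keeps the example dependency-free.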

Paper Structure (6 sections, 3 equations, 4 figures, 1 table)


Figures (4)

  • Figure 1: The signed distance field approximates the surface as the decision boundary separating space-time coordinates ($\textbf{p}_k,t_k$) that are inside and outside the surface. For each ($\textbf{p}_k,t_k$), the signed distance $\hat{d}$ to the surface is predicted with the network $f_\theta$ based on a concatenation ($\oplus$) of the clinical demography latent vector $\textbf{z}_c$, the individual latent vector $\textbf{z}_r$ and the coordinate. $\textbf{z}_c$ is embedded by the clinical demography encoder $g_\phi$, whereas the source of $\textbf{z}_r$ depends on the task. Training: Each training sample is assigned a learnable embedding $\textbf{z}_r$, which is optimized jointly with $g_\phi$ and $f_\theta$. Reconstruction: The individual embedding is learned by locking the parameters of $f_\theta$ and $g_\phi$ and optimizing for $\textbf{z}_r$. Generation: A new $\textbf{z}_r$ is generated by sampling from a multivariate Gaussian distribution.
  • Figure 2: Two examples of completed sequences, illustrated with volume curves and the chamfer distance (CD) between the predicted and true surfaces at time frames $0\%$, $35\%$ and $75\%$ (coloured surfaces), as well as the surface with maximum volume (wireframe). The blue example ($\square$) corresponds to the 25th percentile evaluated on average CD, whereas the red example ($\bigcirc$) shows an abnormal atrial motion without an active emptying phase.
  • Figure 3: Distribution of left atrial fractional change (FC) across the subgroups based on gender (male/female) and age ($<50$, 50-59, 60-69 and $>69$) in the population from the test set (left) and in a synthetic population generated with the same clinical demography as the population in the test set (right).
  • Figure 4: Synthetically created samples. Left: fixed clinical demography (50-59 years old, male, systolic blood pressure equal to 130 mmHg) and sampled $\textbf{z}_r$. Right: fixed individual latent and varying $\textbf{z}_c$. The figure shows the first two principal components (PC) of the latent spaces, the volume curves for all samples, and the generated anatomies at $t=0\%$ (surface) and at $V_\text{max}$ (wireframe).
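Two operations mentioned in the captions above can be sketched briefly: drawing a new $\textbf{z}_r$ from a multivariate Gaussian (Figure 1, Generation) and computing a chamfer distance between predicted and true surfaces (Figure 2). The sketch makes assumptions the captions do not confirm. It fits the Gaussian to the learned training embeddings, which is one plausible choice, and it uses one common chamfer variant (the mean of unsquared nearest-neighbour distances, averaged over both directions); the paper may define either differently. Function names and array sizes are hypothetical.

```python
import numpy as np

def sample_residual_latents(z_train, n_samples, seed=0):
    """Fit a multivariate Gaussian to learned z_r vectors (rows) and draw new ones."""
    rng = np.random.default_rng(seed)
    mean = z_train.mean(axis=0)
    cov = np.cov(z_train, rowvar=False)          # (dim, dim) sample covariance
    return rng.multivariate_normal(mean, cov, size=n_samples)

def chamfer_distance(a, b):
    """Symmetric chamfer distance between point sets a (N, 3) and b (M, 3)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)   # (N, M) pairwise distances
    return 0.5 * (d.min(axis=1).mean() + d.min(axis=0).mean())

# Toy usage: 100 made-up training embeddings of dimension 4, and two small point clouds.
rng = np.random.default_rng(1)
z_train = rng.normal(size=(100, 4))
z_new = sample_residual_latents(z_train, n_samples=5)
pts_pred = rng.uniform(-1, 1, size=(50, 3))
pts_true = pts_pred + 0.01                       # shifted copy as a stand-in for ground truth
cd = chamfer_distance(pts_pred, pts_true)
```

The pairwise distance matrix scales as N × M, so for dense surface samplings a nearest-neighbour structure such as a k-d tree would be the more practical choice.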