Spatio-temporal neural distance fields for conditional generative modeling of the heart
Kristine Sørensen, Paula Diez, Jan Margeta, Yasmin El Youssef, Michael Pham, Jonas Jalili Pedersen, Tobias Kühl, Ole de Backer, Klaus Kofoed, Oscar Camara, Rasmus Paulsen
TL;DR
This work models dynamic cardiac anatomy in relation to clinical factors by introducing a conditional generative model based on spatio-temporal neural distance fields. An auto-decoder architecture learns two latent codes, $\mathbf{z}_c=g_\phi(\mathbf{c})$ derived from the clinical demography $\mathbf{c}$ and $\mathbf{z}_r$ as an individual residual, so that a single network can represent the geometry and motion of many left atria (LA), including the left atrial appendage (LAA), via $\hat{d}=f_\theta(\mathbf{z}_c \oplus \mathbf{z}_r \oplus \mathbf{x})$ with $\mathbf{x}=(\mathbf{p},t)$, where $\mathbf{p}$ is a spatial position and $t$ a time point. Training optimizes a clamped L1 loss over a large set of space-time samples, and surfaces are extracted from a $128^3$ grid; the approach is demonstrated on 4D CFA data from 667 participants. The method outperforms state-of-the-art methods in sequence completion and generates realistic, demography-conditioned anatomical sequences. It also makes it possible to infer functional metrics (e.g., $V_{\max}$, $FC$, $CC$) from static images and to synthesize cohorts with specified demographic factors. Overall, the framework offers a versatile, high-fidelity way to model and explore the spatio-temporal dynamics of moving cardiac anatomies, with applications in planning, disease analysis, and synthetic population studies.
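The forward pass $\hat{d}=f_\theta(\mathbf{z}_c \oplus \mathbf{z}_r \oplus \mathbf{x})$ and the clamped L1 loss can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the layer sizes, the one-hidden-layer MLP, the linear form of $g_\phi$, the clamping threshold `delta`, and the helper `grid_points` are all assumptions made for illustration, and the clamped L1 is written in the form commonly used for neural distance fields, $|\mathrm{clamp}(\hat{d},\delta)-\mathrm{clamp}(d,\delta)|$, which the summary above does not spell out.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration (not from the paper).
N_COND, DIM_ZC, DIM_ZR, HIDDEN = 4, 8, 8, 32

def init_linear(n_in, n_out):
    """Random weights and zero bias for one dense layer."""
    return rng.normal(0.0, 1.0 / np.sqrt(n_in), (n_in, n_out)), np.zeros(n_out)

W_g, b_g = init_linear(N_COND, DIM_ZC)            # parameters of g_phi
W1, b1 = init_linear(DIM_ZC + DIM_ZR + 4, HIDDEN)  # f_theta, hidden layer
W2, b2 = init_linear(HIDDEN, 1)                    # f_theta, output layer

def g_phi(c):
    """Map clinical factors c to the conditional latent z_c (assumed linear)."""
    return c @ W_g + b_g

def f_theta(z_c, z_r, x):
    """Predict a distance for each space-time sample x = (p, t), shape (n, 4)."""
    z = np.concatenate([z_c, z_r])                 # z_c (+) z_r
    inp = np.concatenate([np.tile(z, (x.shape[0], 1)), x], axis=1)
    h = np.maximum(inp @ W1 + b1, 0.0)             # ReLU
    return (h @ W2 + b2)[:, 0]

def clamped_l1(d_pred, d_true, delta=0.1):
    """Clamped L1: both distances are clipped to [-delta, delta] first."""
    return np.mean(np.abs(np.clip(d_pred, -delta, delta)
                          - np.clip(d_true, -delta, delta)))

def grid_points(res, t):
    """Regular res^3 grid in [-1, 1]^3 at a fixed time t, as (res^3, 4) samples."""
    axis = np.linspace(-1.0, 1.0, res)
    p = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1).reshape(-1, 3)
    return np.concatenate([p, np.full((p.shape[0], 1), t)], axis=1)

# Usage: one subject, random space-time samples.
c = np.array([0.5, 1.0, 0.0, 0.3])   # hypothetical normalised clinical factors
z_r = np.zeros(DIM_ZR)               # residual latent, learned per subject in an auto-decoder
x = np.concatenate([rng.uniform(-1, 1, (1000, 3)), rng.uniform(0, 1, (1000, 1))], axis=1)
d_hat = f_theta(g_phi(c), z_r, x)
```

In an auto-decoder setting, $\mathbf{z}_r$ would be a free parameter optimized jointly with $\theta$ and $\phi$ rather than produced by an encoder. Evaluating `f_theta` on `grid_points(128, t)` gives the kind of dense distance volume from which a surface could be extracted at time $t$, for example with marching cubes.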
Abstract
The rhythmic pumping motion of the heart is a cornerstone of life, as it circulates blood to the entire human body through a series of carefully timed contractions of the individual chambers. Changes in the size, shape and movement of the chambers can be important markers of cardiac disease, and modeling these changes in relation to clinical demography or disease is therefore of interest. Existing methods for spatio-temporal modeling of the human heart require shape correspondence over time or suffer from large memory requirements, making them difficult to use for complex anatomies. We introduce a novel conditional generative model in which shape and movement are modeled implicitly in the form of a spatio-temporal neural distance field and conditioned on clinical demography. The model is based on an auto-decoder architecture and aims to disentangle the individual variations from those related to the clinical demography. It is tested on the left atrium (including the left atrial appendage), where it outperforms current state-of-the-art methods for anatomical sequence completion and generates synthetic sequences that realistically mimic the shape and motion of the real left atrium. In practice, this means we can infer functional measurements from a static image, generate synthetic populations with specified demography or disease, and investigate how non-imaging clinical data affect the shape and motion of cardiac anatomies.
