HeartFormer: Semantic-Aware Dual-Structure Transformers for 3D Four-Chamber Cardiac Point Cloud Reconstruction
Zhengda Ma, Abhirup Banerjee
TL;DR
This work introduces HeartFormer, the first geometric deep learning framework for multi-class 3D four-chamber cardiac reconstruction from cine MRI using point clouds. It combines a Semantic-Aware Dual-Structure Transformer (SA-DSTNet), which produces coarse semantic geometry, with a two-stage Semantic-Aware Geometry-Feature Refinement Transformer (SA-GFRTNet), which performs progressive, anatomy-guided refinement; both are supervised by a Semantic-Aware Chamfer Distance (SA-CD). A new large-scale dataset, HeartCompv1 (17,000 samples), enables cross-domain evaluation alongside UK Biobank data, showing that HeartFormer consistently outperforms state-of-the-art methods in geometric fidelity and anatomical coherence while using fewer parameters. The approach advances 3D/4D cardiac modeling from limited 2D cine MRI data, with potential clinical impact in biomarker analysis and patient-specific simulations. Future work includes end-to-end integration with segmentation for fully automated reconstruction from 2D cine MRI.
Abstract
We present the first geometric deep learning framework based on a point cloud representation for 3D four-chamber cardiac reconstruction from cine MRI data. This work addresses a long-standing limitation of conventional cine MRI, which typically provides only 2D slice images of the heart and thereby restricts a comprehensive understanding of cardiac morphology and physiological mechanisms in both healthy and pathological conditions. To overcome this, we propose HeartFormer, a novel point cloud completion network that extends traditional single-class point cloud completion to the multi-class setting. HeartFormer consists of two key components: a Semantic-Aware Dual-Structure Transformer Network (SA-DSTNet) and a Semantic-Aware Geometry Feature Refinement Transformer Network (SA-GFRTNet). SA-DSTNet generates an initial coarse point cloud together with global and substructure geometry features. Guided by these semantic-geometry representations, SA-GFRTNet progressively refines the coarse output, leveraging both global and substructure geometric priors to produce high-fidelity and geometrically consistent reconstructions. We further construct HeartCompv1, the first publicly available large-scale dataset with 17,000 high-resolution 3D multi-class cardiac meshes and point clouds, to establish a general benchmark for this emerging research direction. Extensive cross-domain experiments on HeartCompv1 and UK Biobank demonstrate that HeartFormer achieves robust, accurate, and generalizable performance, consistently surpassing state-of-the-art (SOTA) methods. Code and dataset will be released upon acceptance at: https://github.com/10Darren/HeartFormer.
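To make the idea of a semantic-aware Chamfer distance more concrete, the following is a minimal sketch and not the authors' SA-CD, whose exact definition is not given in this summary. It assumes that each point carries an integer class label, that the Chamfer distance is computed separately between predicted and target points of the same class, and that the per-class values are averaged. The function names, the `num_classes=4` default (one label per chamber), and the use of squared distances are all illustrative assumptions.

```python
import numpy as np


def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3)."""
    # Pairwise squared Euclidean distances, shape (N, M).
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    # Mean nearest-neighbour squared distance in both directions.
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()


def semantic_chamfer_distance(pred, pred_labels, target, target_labels,
                              num_classes=4):
    """Average of per-class Chamfer distances (hypothetical stand-in for SA-CD)."""
    per_class = []
    for c in range(num_classes):
        p = pred[pred_labels == c]
        t = target[target_labels == c]
        if len(p) == 0 or len(t) == 0:
            continue  # skip classes missing from either cloud
        per_class.append(chamfer_distance(p, t))
    if not per_class:
        raise ValueError("no class is present in both point clouds")
    return float(np.mean(per_class))


if __name__ == "__main__":
    # Synthetic data only: 4 classes with 100 points each, plus small noise.
    rng = np.random.default_rng(0)
    target = rng.normal(size=(400, 3))
    labels = np.repeat(np.arange(4), 100)
    pred = target + 0.01 * rng.normal(size=target.shape)
    print(semantic_chamfer_distance(pred, labels, target, labels))
```

Unlike a single-class Chamfer distance, this class-wise form penalizes a point that lies on the correct surface but carries the wrong label, which is one plausible reason to make the loss semantic-aware in a multi-class setting.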
