Aneumo: A Large-Scale Multimodal Aneurysm Dataset with Computational Fluid Dynamics Simulations and Deep Learning Benchmarks
Xigui Li, Yuanye Zhou, Feiyang Xiao, Xin Guo, Chen Jiang, Tan Pan, Xingmeng Zhang, Cenyu Liu, Zeyun Miao, Jianchao Ge, Xiansheng Wang, Qimeng Wang, Yichi Zhang, Wenbo Zhang, Fengping Zhu, Limei Han, Yuan Qi, Chensen Lin, Yuan Cheng
TL;DR
Aneumo addresses the lack of large-scale, multimodal data for intracranial aneurysm hemodynamics. Starting from 427 real aneurysm geometries, the authors generate 10,660 synthetic aneurysm models by controlled deformation and simulate each one under eight steady-state flow conditions, producing 85,280 velocity and pressure fields along with segmentation masks. The simulations are run with OpenFOAM on scalable HPC infrastructure, and the resulting multimodal resource (geometry, masks, meshes, and field data) is intended to accelerate data-driven hemodynamic modeling and surrogate simulation. The authors also introduce a benchmark for estimating flow parameters with SciML architectures, showing that a Swin Transformer–based geometric encoder integrated into DeepONet (DeepONet-SwinT) achieves higher accuracy and robustness on unseen geometries and flow conditions than a standard DeepONet. Collectively, Aneumo supports rapid, data-driven analysis and surrogate modeling for aneurysm risk assessment, potentially informing real-time clinical decision-making and personalized treatment planning.
Abstract
Intracranial aneurysms (IAs) are serious cerebrovascular lesions found in approximately 5% of the general population, and their rupture is associated with high mortality. Current methods for assessing IA risk focus on morphological and patient-specific factors, but the hemodynamic influences on IA development and rupture remain unclear. Conventional computational fluid dynamics (CFD) methods are accurate for hemodynamic studies but computationally intensive, which hinders their deployment in large-scale or real-time clinical applications. To address this challenge, we curated a large-scale, high-fidelity aneurysm CFD dataset to facilitate the development of efficient machine learning algorithms for such applications. Based on 427 real aneurysm geometries, we synthesized 10,660 3D shapes via controlled deformation to simulate aneurysm evolution. The authenticity of these synthetic shapes was confirmed by neurosurgeons. CFD computations were performed on each shape under eight steady-state mass flow conditions, generating a total of 85,280 blood flow simulation results covering key hemodynamic parameters. The dataset also includes segmentation masks, which support tasks that take images, point clouds, or other multimodal data as input. In addition, we introduce a benchmark for estimating flow parameters to assess current modeling methods. This dataset aims to advance aneurysm research and promote data-driven approaches in biofluids, biomedical engineering, and clinical risk assessment. The code and dataset are available at: https://github.com/Xigui-Li/Aneumo.
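The dataset size follows directly from the counts stated above: every synthetic shape is simulated under every flow condition. The short Python sketch below checks that arithmetic by enumerating the case grid. The `(shape_index, condition_index)` indexing and the function name `enumerate_cases` are hypothetical, chosen only for illustration; they are not the repository's actual file layout or API.

```python
from itertools import product

# Counts taken from the abstract.
N_REAL_GEOMETRIES = 427      # real aneurysm geometries
N_SYNTHETIC_SHAPES = 10_660  # 3D shapes obtained by controlled deformation
N_FLOW_CONDITIONS = 8        # steady-state mass flow conditions per shape


def enumerate_cases(n_shapes, n_conditions):
    """Yield one (shape_index, condition_index) pair per CFD run.

    Hypothetical indexing scheme, assumed here for illustration only.
    """
    return product(range(n_shapes), range(n_conditions))


# One CFD run per (shape, flow condition) pair.
total_cases = sum(1 for _ in enumerate_cases(N_SYNTHETIC_SHAPES, N_FLOW_CONDITIONS))

# Average number of synthetic shapes derived from each real geometry.
avg_shapes_per_geometry = N_SYNTHETIC_SHAPES / N_REAL_GEOMETRIES

print(total_cases)
print(round(avg_shapes_per_geometry, 2))
```

The enumerated total matches the 85,280 results reported in the abstract. The average number of shapes per real geometry is not a whole number, so the count of deformations evidently varies between base geometries.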
