Improving Deep Learning-based Automatic Cranial Defect Reconstruction by Heavy Data Augmentation: From Image Registration to Latent Diffusion Models
Marek Wodzinski, Kamil Kwarciak, Mateusz Daniol, Daria Hemmerling
TL;DR
The paper addresses the poor generalization of cranial defect reconstruction when ground-truth data is limited by benchmarking a broad set of data augmentation strategies. It shows that a heavy augmentation pipeline, combining extreme geometric transforms, deformable image registration (IR), and latent diffusion model (LDM) augmentation on VQVAE representations, substantially improves downstream defect reconstruction, with average Dice scores above 0.94 on SkullBreak and above 0.96 on SkullFix and strong qualitative generalization to real clinical defects. The best-performing configuration (Geo + IR + LDM-VQVAE) outperforms prior state-of-the-art methods, suggesting that heterogeneous synthetic training data is key to clinical applicability. The approach could allow networks trained purely on synthetic defects to produce clinically usable reconstructions, potentially reducing the design time and cost of personalized cranial implants. The work also discusses computational trade-offs, limitations of IR-based augmentation, and future directions toward mesh-based downstream tasks and multi-institution validation.
Abstract
Modeling and manufacturing of personalized cranial implants are important research areas that may decrease the waiting time for patients suffering from cranial damage. The modeling of personalized implants may be partially automated by the use of deep learning-based methods. However, this task suffers from poor generalizability to data from previously unseen distributions, which makes it difficult to use the research outcomes in real clinical settings. Because ground-truth annotations are difficult to acquire, techniques that increase the heterogeneity of the datasets used to train the deep networks have to be considered and introduced. In this work, we present a large-scale study of several augmentation techniques, ranging from classical geometric transformations, image registration, variational autoencoders, and generative adversarial networks to the most recent advances in latent diffusion models. We show that the use of heavy data augmentation significantly improves both the quantitative and qualitative outcomes, resulting in an average Dice Score above 0.94 for the SkullBreak and above 0.96 for the SkullFix datasets. Moreover, we show that the network trained on synthetically augmented data successfully reconstructs real clinical defects. The work is a considerable contribution to the field of artificial intelligence in the automatic modeling of personalized cranial implants.
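For readers unfamiliar with the reported metric, the Dice Score between a predicted binary volume A and a ground-truth binary volume B is 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function name `dice_score`, the handling of two empty masks, and the toy volumes are assumptions made for illustration.

```python
import numpy as np


def dice_score(pred: np.ndarray, target: np.ndarray) -> float:
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) between two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    if pred.shape != target.shape:
        raise ValueError("masks must have the same shape")
    intersection = np.logical_and(pred, target).sum()
    denominator = pred.sum() + target.sum()
    # Assumption: two empty masks agree perfectly, so the score is 1.0.
    if denominator == 0:
        return 1.0
    return float(2.0 * intersection / denominator)


# Toy volumes (hypothetical data): a 4x4x4 cube and the same cube
# shifted by one voxel along the first axis.
target = np.zeros((8, 8, 8), dtype=bool)
target[2:6, 2:6, 2:6] = True
pred = np.zeros((8, 8, 8), dtype=bool)
pred[3:7, 2:6, 2:6] = True

score = dice_score(pred, target)
```

In the toy case the two cubes hold 64 voxels each and share 48, so the score is 96 / 128 = 0.75. A reported average above 0.94 therefore corresponds to near-complete voxel overlap between the reconstructed and reference volumes.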
