Synthetic Skull CT Generation with Generative Adversarial Networks to Train Deep Learning Models for Clinical Transcranial Ultrasound

Kasra Naftchi-Ardebili, Karanpartap Singh, Reza Pourabolghasem, Pejman Ghanouni, Gerald R. Popelka, Kim Butts Pauly

TL;DR

The proposed generative adversarial network SkullGAN makes it possible for researchers to generate large numbers of synthetic skull CT segments, necessary for training neural networks for medical applications involving the human skull, mitigating challenges with access, privacy, capital, time, and the need for domain expertise.

Abstract

Deep learning offers potential for various healthcare applications, yet requires extensive datasets of curated medical images, for which data privacy, cost, and distribution mismatch across acquisition centers can become major problems. To overcome these challenges, we propose a generative adversarial network (SkullGAN) to create large datasets of synthetic skull CT slices, geared towards training models for transcranial ultrasound. With wide-ranging applications in the treatment of essential tremor, Parkinson's, and Alzheimer's disease, transcranial ultrasound clinical pipelines can be significantly optimized via integration of deep learning. The main roadblock is the lack of sufficient skull CT slices for training, which SkullGAN aims to address. Actual CT slices of 38 healthy subjects were used for training. The generated synthetic skull images were then evaluated based on skull density ratio, mean thickness, and mean intensity. Their fidelity was further analyzed using t-distributed stochastic neighbor embedding (t-SNE), Fréchet inception distance (FID) score, and a visual Turing test (VTT) taken by four staff clinical radiologists. SkullGAN-generated images demonstrated quantitative radiological features similar to those of real skulls. t-SNE failed to separate real and synthetic samples from one another, and the FID score was 49. Expert radiologists achieved a 60% mean accuracy on the VTT. SkullGAN makes it possible for researchers to generate large numbers of synthetic skull CT segments, necessary for training neural networks for medical applications involving the human skull, such as transcranial focused ultrasound, mitigating challenges with access, privacy, capital, time, and the need for domain expertise.
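For readers unfamiliar with the FID score cited in the abstract, the following is a minimal sketch of the underlying Fréchet distance between two Gaussians fitted to feature vectors. It is not the authors' evaluation code. In the standard FID, the features are Inception-v3 activations; here, random arrays stand in for them, and the function name `frechet_distance` is a hypothetical choice for illustration.

```python
import numpy as np

def frechet_distance(feats_real, feats_synth):
    """Frechet distance between Gaussians fitted to two feature sets.

    Each input is an array of shape (n_samples, n_features).
    Assumption: the features were already extracted elsewhere
    (for a true FID they would come from an Inception-v3 network).
    """
    mu_r = feats_real.mean(axis=0)
    mu_s = feats_synth.mean(axis=0)
    cov_r = np.atleast_2d(np.cov(feats_real, rowvar=False))
    cov_s = np.atleast_2d(np.cov(feats_synth, rowvar=False))

    # Tr((cov_r cov_s)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_r @ cov_s, which are real and non-negative for
    # covariance matrices; clip guards against tiny negative round-off.
    eigvals = np.linalg.eigvals(cov_r @ cov_s)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    mean_term = np.sum((mu_r - mu_s) ** 2)
    return float(mean_term + np.trace(cov_r) + np.trace(cov_s) - 2.0 * trace_sqrt)

# Stand-in features (illustrative only, not data from the paper).
rng = np.random.default_rng(0)
feats_a = rng.normal(size=(200, 4))
feats_b = feats_a + 3.0  # same covariance, every feature mean shifted by 3

d_same = frechet_distance(feats_a, feats_a)
d_shift = frechet_distance(feats_a, feats_b)
```

Identical feature sets give a distance of zero, and a pure mean shift contributes only the squared distance between the means. Lower values indicate that the two feature distributions are closer.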

Paper Structure

This paper contains 28 sections, 8 equations, 8 figures, and 2 tables.

Figures (8)

  • Figure 1: SkullGAN generator and training pipeline. SkullGAN was first pre-trained on the Celeb-A dataset, and then trained on human skull CTs. In contrast to random initialization of the weights for training on the human skull CTs, pre-training yielded layers with fine-tuned weights for detecting edges and resulted in better quality skull segment images, with finer definition both in contour and interior bone structure.
  • Figure 2: SkullGAN training set preparation. After segmentation, the slices were masked and rotated where necessary. To account for both the left and the right temporal bones, two segments were taken from each axial slice. This resulted in a training set of 2,414 2D horizontal skull segments.
  • Figure 3: Five cropped examples from each dataset. a. Training Set: real skull CT segments. b. Synthetic Set: skull CT segments generated by SkullGAN. c. Artificial Set: idealized and unrealistic fake skull CT segments deliberately engineered to look unlike any real skull segments and yet fool quantitative radiological assessment metrics.
  • Figure 4: Violin plots of the quantitative radiological metrics for the training, synthetic, and artificial sets. Notably, we can engineer artificial skulls that are ostensibly unrealistic and still match the training set (real skull CTs) across the three radiological metrics: skull density ratio, mean intensity, and mean thickness. We can even match the shapes of the distributions: the bimodal distribution of mean intensity for the artificial set resembles that of the training set.
  • Figure 5: Separability of the datasets. a. Visual representation of t-SNE applied to the radiological features shown in Figure 4. No discernible clustering is seen, and the datasets appear inseparable with this method. b. Visual representation of t-SNE applied to the unrolled skulls, where each image is represented as a vector of size 16,384. The artificial set is clearly separated into clusters by t-SNE (one for the unrealistic models and another for the idealized models), while the other distributions remain inseparable.
  • ...and 3 more figures
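
The "unrolling" mentioned in the Figure 5 caption can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the caption gives only the vector size of 16,384, so the 128 x 128 image shape used here (128 x 128 = 16,384) is an assumption, as are the function name `unroll` and the random stand-in images.

```python
import numpy as np

def unroll(images):
    """Flatten a batch of 2D images into one row vector per image.

    images: array of shape (n_images, height, width).
    Returns an array of shape (n_images, height * width).
    """
    images = np.asarray(images, dtype=np.float32)
    return images.reshape(images.shape[0], -1)

# Random stand-in images; 128 x 128 is an assumed shape that yields
# the 16,384-element vectors mentioned in the caption.
rng = np.random.default_rng(0)
fake_batch = rng.random((6, 128, 128))

X = unroll(fake_batch)  # one 16,384-dimensional row per image
```

A matrix such as `X` is the kind of input a t-SNE implementation expects, for example `sklearn.manifold.TSNE(n_components=2).fit_transform(X)` if scikit-learn is available. With real data, the perplexity must be smaller than the number of samples.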