
Random Expert Sampling for Deep Learning Segmentation of Acute Ischemic Stroke on Non-contrast CT

Sophie Ostmeier, Brian Axelrod, Benjamin Pulli, Benjamin F. J. Verhaaren, Abdelkader Mahammedi, Yongkai Liu, Christian Federau, Greg Zaharchuk, Jeremy J. Heit

TL;DR

This study addresses automatic delineation of the ischemic core on non-contrast CT (NCCT) to aid acute stroke triage. It introduces random expert sampling as a training scheme for a benchmark U-Net, trained on three neuroradiologists' annotations from the DEFUSE 3 cohort, and compares it with majority voting and with inter-expert agreement. Random expert sampling achieves higher agreement with the experts than the experts have among themselves, and yields ischemic core volumes that correlate with final infarct volumes and clinical outcomes, performing comparably to CT perfusion in some respects. The approach could enable accurate, reliable NCCT-based triage in less specialized hospitals, potentially expanding access to endovascular therapy by reducing dependence on perfusion imaging. Methodological support comes from bootstrap and cross-validation analyses and from supplementary theoretical derivations for the training loss.
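The difference between the two training schemes can be illustrated with a minimal sketch. It assumes that random expert sampling means drawing one expert's annotation uniformly at random each time a case is presented during training, whereas majority vote fuses the annotations into one fixed target. The function names and the toy masks are hypothetical, and the masks are flattened to short lists for readability.

```python
import random


def majority_vote(masks):
    """Voxel-wise majority vote over an odd number of binary expert masks."""
    n_experts = len(masks)
    return [1 if 2 * sum(votes) > n_experts else 0 for votes in zip(*masks)]


def random_expert_sample(masks, rng):
    """Return one expert's mask, chosen uniformly at random (assumed scheme)."""
    return masks[rng.randrange(len(masks))]


# Toy flattened masks from three hypothetical experts (1 = ischemic voxel).
expert_masks = [
    [1, 1, 0, 0, 1],
    [1, 0, 0, 1, 1],
    [0, 1, 0, 0, 1],
]

rng = random.Random(0)  # seeded only to make the sketch reproducible

# Majority vote: the same fused target every time the case is seen.
mv_target = majority_vote(expert_masks)

# Random expert sampling: a possibly different expert target at each step.
res_targets = [random_expert_sample(expert_masks, rng) for _ in range(4)]
```

Under this reading, the model trained with random expert sampling sees every expert's annotation over the course of training, while the majority-vote model only ever sees the fused mask.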

Abstract

Purpose: To evaluate multi-expert deep learning training methods for automatically quantifying ischemic brain tissue on Non-Contrast CT.

Materials and Methods: The data set consisted of 260 Non-Contrast CTs from 233 patients with acute ischemic stroke recruited in the DEFUSE 3 trial. A benchmark U-Net was trained on the reference annotations of three experienced neuroradiologists to segment ischemic brain tissue, using majority vote and random expert sampling as training schemes. We used a one-sided Wilcoxon signed-rank test on a set of segmentation metrics to compare bootstrapped point estimates of the training schemes with the inter-expert agreement, and the ratio of variance for consistency analysis. We further compared volumes with the 24h follow-up DWI (final infarct core) in the patient subgroup with full reperfusion, and tested volumes for correlation with clinical outcome (mRS after 30 and 90 days) using the Spearman method.

Results: Random expert sampling yields a model that agrees better with the experts than the experts agree among themselves, and better than a majority-vote model agrees with the experts (Surface Dice at 5 mm tolerance improved by 61% to 0.70 ± 0.03; Dice improved by 25% to 0.50 ± 0.04). The model-predicted volume estimated the final infarct volume similarly to CT perfusion and correlated better with clinical outcome than CT perfusion.

Conclusion: A model trained with random expert sampling can identify the presence and location of acute ischemic brain tissue on Non-Contrast CT similarly to CT perfusion and with better consistency than experts. This may further secure the selection of patients eligible for endovascular treatment in less specialized hospitals.
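The agreement comparison can be sketched for a single case as follows. This is an illustration under stated assumptions, not the paper's evaluation code: it uses plain Dice only, takes the median over expert pairs as the inter-expert agreement and the median over model-expert pairs as the model-expert agreement, and treats two empty masks as perfect agreement. All names and masks are hypothetical.

```python
from itertools import combinations
from statistics import median


def dice(a, b):
    """Dice coefficient of two binary masks given as flat 0/1 sequences."""
    intersection = sum(x & y for x, y in zip(a, b))
    total = sum(a) + sum(b)
    # Assumed convention: two empty masks agree perfectly.
    return 1.0 if total == 0 else 2.0 * intersection / total


def inter_expert_agreement(expert_masks):
    """Median Dice over all pairs of experts for one case."""
    return median(dice(a, b) for a, b in combinations(expert_masks, 2))


def model_expert_agreement(prediction, expert_masks):
    """Median Dice between one model prediction and each expert for one case."""
    return median(dice(prediction, mask) for mask in expert_masks)


# Toy flattened masks for one hypothetical case.
experts = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 0],
]
prediction = [1, 1, 0, 0]

inter = inter_expert_agreement(experts)             # pairwise Dice: 2/3, 0.8, 0.5
model = model_expert_agreement(prediction, experts)  # Dice vs experts: 1.0, 2/3, 0.8
```

Collecting such per-case values over a test set gives paired samples. A one-sided Wilcoxon signed-rank test on those pairs is available as `scipy.stats.wilcoxon(model_scores, inter_scores, alternative="greater")`, and a Spearman correlation between volumes and mRS scores as `scipy.stats.spearmanr`.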

Paper Structure (26 sections, 1 equation, 8 figures, 6 tables)


Figures (8)

  • Figure 1: Flowchart of the data partition. 233 patients with their initial NCCT were randomly split into 5 folds of training and test sets. 25 patients had multiple NCCTs. The additional NCCTs were only assigned alongside the initial NCCT when the patient was in the training set. The external generalization cohort included 35 patients.
  • Figure 2: Training scheme pipeline with sampling strategy for random expert sampling and majority vote
  • Figure 3: First, two models were trained, one on majority vote and one on random expert sampling. Second, the median agreement per case (inter-expert agreement, and model-expert agreement for the majority vote and random expert sampling predictions) was the basis for comparing random expert sampling with majority vote and inter-expert agreement.
  • Figure 4: Bland-Altman plot for Random Expert Sampling (blue) and Majority Vote (green) Model Volume estimates compared to Median Expert Volume.
  • Figure 5: Bland-Altman for Random Expert Sampling Model Volume (blue) and CTP Ischemic Core Volume <30% (red) compared to 24h DWI-Volume for all reperfusors (TICI$\geq$2B).
  • ...and 3 more figures
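The quantities behind a Bland-Altman comparison such as those in Figures 4 and 5 can be computed as in the sketch below. It assumes the conventional definition (bias as the mean difference, limits of agreement as bias ± 1.96 sample standard deviations of the differences). The volumes are made-up illustrative numbers, not data from the paper.

```python
from statistics import mean, stdev


def bland_altman(estimates, references):
    """Return (bias, lower limit, upper limit) of agreement for paired volumes."""
    diffs = [e - r for e, r in zip(estimates, references)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # sample standard deviation (n - 1)
    return bias, bias - spread, bias + spread


# Hypothetical paired volumes in mL (illustrative only).
model_volume_ml = [12.0, 30.0, 8.0, 55.0]
reference_volume_ml = [10.0, 33.0, 9.0, 50.0]

bias, lower, upper = bland_altman(model_volume_ml, reference_volume_ml)
```

A bias close to zero with narrow limits indicates that the two volume estimates can be used interchangeably. A plot would place the per-case mean of the two volumes on the x-axis and their difference on the y-axis.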