A novel open-source ultrasound dataset with deep learning benchmarks for spinal cord injury localization and anatomical segmentation

Avisha Kumar, Kunal Kotkar, Kelly Jiang, Meghana Bhimreddy, Daniel Davidar, Carly Weber-Levine, Siddharth Krishnan, Max J. Kerensky, Ruixing Liang, Kelley Kempski Leadingham, Denis Routkevitch, Andrew M. Hersh, Kimberly Ashayeri, Betty Tyler, Ian Suk, Jennifer Son, Nicholas Theodore, Nitish Thakor, Amir Manbachi

TL;DR

This is the largest annotated dataset of spinal cord ultrasound images made publicly available to researchers and medical professionals, as well as the first public report of object detection and segmentation architectures to assess anatomical markers in the spinal cord for methodology development and clinical applications.

Abstract

While deep learning has catalyzed breakthroughs across numerous domains, its broader adoption in clinical settings is inhibited by the costly and time-intensive nature of data acquisition and annotation. To further facilitate medical machine learning, we present an ultrasound dataset of 10,223 Brightness-mode (B-mode) images consisting of sagittal slices of porcine spinal cords (N=25) before and after a contusion injury. We additionally benchmark the performance metrics of several state-of-the-art object detection algorithms to localize the site of injury and semantic segmentation models to label the anatomy for comparison and creation of task-specific architectures. Finally, we evaluate the zero-shot generalization capabilities of the segmentation models on human ultrasound spinal cord images to determine whether training on our porcine dataset is sufficient for accurately interpreting human data. Our results show that the YOLOv8 detection model outperforms all evaluated models for injury localization, achieving a mean Average Precision (mAP50-95) score of 0.606. Segmentation metrics indicate that the DeepLabv3 segmentation model achieves the highest accuracy on unseen porcine anatomy, with a Mean Dice score of 0.587, while SAMed achieves the highest Mean Dice score generalizing to human anatomy (0.445). To the best of our knowledge, this is the largest annotated dataset of spinal cord ultrasound images made publicly available to researchers and medical professionals, as well as the first public report of object detection and segmentation architectures to assess anatomical markers in the spinal cord for methodology development and clinical applications.
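The Mean Dice scores quoted above (0.587 on porcine images, 0.445 on human images) can be illustrated with a minimal sketch. It assumes flattened integer label maps and averages the Dice coefficient over classes, skipping any class absent from both masks. The function names, the toy labels, and the averaging convention are illustrative assumptions; the evaluation protocol actually used in the paper is not given in this summary.

```python
from typing import Optional, Sequence


def dice_score(pred: Sequence[int], target: Sequence[int], label: int) -> Optional[float]:
    """Dice coefficient 2|A∩B| / (|A| + |B|) for one class label."""
    pred_hits = sum(1 for p in pred if p == label)
    target_hits = sum(1 for t in target if t == label)
    overlap = sum(1 for p, t in zip(pred, target) if p == label and t == label)
    denom = pred_hits + target_hits
    if denom == 0:
        return None  # class absent from both masks (assumption: skip it)
    return 2.0 * overlap / denom


def mean_dice(pred: Sequence[int], target: Sequence[int], labels: Sequence[int]) -> float:
    """Average the per-class Dice over the classes that appear in either mask."""
    scores = [dice_score(pred, target, c) for c in labels]
    valid = [s for s in scores if s is not None]
    return sum(valid) / len(valid) if valid else float("nan")


# Toy example with hypothetical labels: 0 = background, 1 = dura, 2 = spinal cord.
predicted = [0, 1, 1, 2, 2, 2]
reference = [0, 1, 2, 2, 2, 0]
print(mean_dice(predicted, reference, labels=[1, 2]))
```

Skipping absent classes is one common convention; counting them as 1.0 or 0.0 instead would change the reported mean, so the choice should be stated alongside any benchmark number.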

Paper Structure

This paper contains 21 sections, 1 equation, 6 figures, and 5 tables.

Figures (6)

  • Figure 1: Data collection of porcine spinal cord. (a) An aerial view of the female Yorkshire pig with a laminectomy to expose the spinal cord. (b) The spinous processes and lamina of the 4th to 6th thoracic vertebrae (T4-T6) are removed to provide an acoustic window, and an injury is induced with a weight drop. (c) An i22LH8 transducer connected to a Canon Aplio i800 ultrasound system is placed above the spinal cord to capture Brightness-mode (B-mode) images of the region of interest. (d) The resulting Digital Imaging and Communications in Medicine (DICOM) image is displayed on a personal computer, showcasing the dura, cerebrospinal fluid (CSF), pia, spinal cord, and the injury location (hematoma). The collected images are included in the final dataset for real-time injury localization and semantic segmentation.
  • Figure 2: Curation of the porcine and human ultrasound spinal cord dataset. (a) Sample pre-injury images from the porcine spinal cord dataset. These images included the primary anatomy of interest and did not suffer from severe noise or artifacts that would render the image difficult to interpret. (b) Sample post-injury images of the dataset that fulfilled the same inclusion criteria as the pre-injury images. The red bounding boxes indicate the hematoma. (c) Sample images that were excluded from the final dataset due to shadowing, noise, or artifacts that affect image quality and occlude the anatomy of interest. (d) Sample human spinal cord images that were used to test the generalizability of the segmentation models.
  • Figure 3: Example images and their corresponding ground truth masks. (a) A typical spinal cord image displaying clear delineation between the anatomical boundaries. The bottom of the image corresponds to the ventral spinal cord. (b) A spinal cord image in which the induced injury caused swelling in the tissue, effectively blurring the delineation between the dura, cerebrospinal fluid (CSF), and the pia. To avoid mistakes in interpretation and inconsistencies in labelling, the region is annotated as the dura/pia complex. (c) In some images, noise and other intraoperative artifacts resulted in ambiguous delineation between the ventral dura and the ventral region. For these types of images, we label that anatomy as the dura/ventral complex.
  • Figure 4: Visualization of the semantic segmentation models' performance overlaid on example porcine images.
  • Figure 5: A comparison of the current standard of care for spinal cord injury with our approach using ultrasound imaging and deep learning for automatic diagnostics. Current treatment approaches do not provide a comprehensive and continuous avenue for monitoring patient spinal cord health after surgical decompression. With automatic injury localization and spinal cord segmentation, clinicians can capture changing biomarkers such as hematoma progression and tissue inflammation to personalize and better understand the treatment approach.
  • ...and 1 more figure
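The hematoma bounding boxes in Figure 2 are scored with mAP50-95, which in the COCO-style convention averages Average Precision over intersection-over-union (IoU) thresholds from 0.50 to 0.95 in steps of 0.05. The sketch below shows only the IoU building block and which thresholds a single predicted box would satisfy. It is not a full AP computation, and the box coordinates and function names are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels

# IoU thresholds 0.50, 0.55, ..., 0.95 used by the mAP50-95 convention.
IOU_THRESHOLDS: List[float] = [(50 + 5 * i) / 100 for i in range(10)]


def box_iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def thresholds_passed(pred: Box, truth: Box) -> List[float]:
    """Thresholds at which the prediction would count as a true positive."""
    iou = box_iou(pred, truth)
    return [t for t in IOU_THRESHOLDS if iou >= t]


# Hypothetical hematoma annotation and a slightly short prediction.
truth_box: Box = (10.0, 10.0, 50.0, 50.0)
pred_box: Box = (10.0, 10.0, 50.0, 41.0)
print(box_iou(pred_box, truth_box), thresholds_passed(pred_box, truth_box))
```

Because the stricter thresholds demand tight localization, a detector can score well at IoU 0.50 yet noticeably lower on mAP50-95.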