Development and validation of an artificial intelligence model to accurately predict spinopelvic parameters

Edward S. Harake, Joseph R. Linzey, Cheng Jiang, Rushikesh S. Joshi, Mark M. Zaki, Jaes C. Jones, Siri S. Khalsa, John H. Lee, Zachary Wilseck, Jacob R. Joseph, Todd C. Hollon, Paul Park

TL;DR

Adult spinal deformity assessment relies on spinopelvic parameters, but manual measurements are time-consuming and vary between observers. We introduce SpinePose, a system of 3 parallel CNNs that automatically predicts SVA, PT, PI, SS, LL, T1PA, and L1PA from standing whole-spine X-rays without manual input. Trained and validated on 761 images and evaluated on a separate 40-image test set labeled by 4 expert reviewers, SpinePose achieves median errors of 1.1–2.6° for the angular parameters and 2.2 mm for SVA, with ICCs in the excellent range (0.91–1.0), and performs robustly even with instrumentation or transitional anatomy. This approach offers fast, reliable radiographic metrics to aid patient selection and surgical planning, with external validation and broader parameter coverage as future directions.

Abstract

Objective. Achieving appropriate spinopelvic alignment has been shown to be associated with improved clinical symptoms. However, measuring spinopelvic radiographic parameters is time-intensive, and interobserver reliability is a concern. Automated measurement tools promise rapid and consistent measurements, but existing tools still require some degree of manual user entry. This study presents a novel artificial intelligence (AI) tool called SpinePose that automatically predicts spinopelvic parameters with high accuracy without the need for manual entry.

Methods. SpinePose was trained and validated on 761 sagittal whole-spine X-rays to predict sagittal vertical axis (SVA), pelvic tilt (PT), pelvic incidence (PI), sacral slope (SS), lumbar lordosis (LL), T1-pelvic angle (T1PA), and L1-pelvic angle (L1PA). A separate test set of 40 X-rays was labeled by 4 reviewers, including fellowship-trained spine surgeons and a fellowship-trained radiologist with neuroradiology subspecialty certification. Median errors relative to the most senior reviewer were calculated to determine model accuracy on test images. Intraclass correlation coefficients (ICC) were used to assess inter-rater reliability.

Results. SpinePose exhibited the following median (interquartile range) parameter errors: SVA: 2.2 (2.3) mm, p=0.93; PT: 1.3 (1.2)°, p=0.48; SS: 1.7 (2.2)°, p=0.64; PI: 2.2 (2.1)°, p=0.24; LL: 2.6 (4.0)°, p=0.89; T1PA: 1.1 (0.9)°, p=0.42; and L1PA: 1.4 (1.6)°, p=0.49. Model predictions also exhibited excellent reliability at all parameters (ICC: 0.91–1.0).

Conclusions. SpinePose accurately predicted spinopelvic parameters with excellent reliability comparable to fellowship-trained spine surgeons and neuroradiologists. Use of predictive AI tools in spinal imaging can substantially aid in patient selection and surgical planning.
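The "median (interquartile range)" error summary reported in the Results can be illustrated with a minimal sketch. It assumes the error for each image is the absolute difference between the model's value and the senior reviewer's value; the abstract does not state this explicitly, and the function name and sample values below are hypothetical, not data from the paper.

```python
import numpy as np

def median_iqr_abs_error(pred, ref):
    """Return (median, IQR) of absolute errors between predictions and reference values.

    Assumption: errors are absolute differences per image; the paper's exact
    definition is not given in the abstract.
    """
    err = np.abs(np.asarray(pred, dtype=float) - np.asarray(ref, dtype=float))
    q1, med, q3 = np.percentile(err, [25, 50, 75])
    return float(med), float(q3 - q1)

# Hypothetical pelvic tilt values in degrees (illustrative only).
model_pt = [10.0, 22.0, 35.0, 41.0, 50.0]
senior_pt = [12.0, 21.0, 35.0, 45.0, 47.0]

med, iqr = median_iqr_abs_error(model_pt, senior_pt)
print(f"PT error: {med:.1f} ({iqr:.1f}) deg")
```

The median and IQR are less sensitive to a few badly mispredicted images than a mean and standard deviation, which may be one reason a summary of this form suits a 40-image test set.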

Paper Structure (15 sections, 5 figures, 3 tables)

Figures (5)

  • Figure 1: SpinePose training pipeline. (A) Standing whole-spine X-rays taken at a single academic institution were identified via an intra-institutional free-text search tool (EMERSE) and subsequently processed at the University of Michigan Radiology IT department. (B) Following image pre-processing, a senior Neurosurgery resident annotated each image with 9 total spinal keypoints at levels C7, T1, L1, S1, and both femoral heads. Bounding boxes were placed around regions L1 and S1. (C) Each input image was fed through 3 parallel convolutional neural networks: L1-model, S1-model, and R (remaining keypoints) model. The L1 and S1 models used a two-stage top-down approach facilitated by a region proposal network (RPN), whereas the R model used a bottom-up approach that does not need an RPN. Regional detection and classification were all optimized by minimizing their respective losses. (D) The respective outputs of the 3 models were combined into 1 aggregate output, and spinopelvic parameters of interest were automatically calculated.
  • Figure 2: Bottom-up vs. top-down training approach. Radar plot comparing accuracy when using a single "bottom-up" model vs. 3 parallel CNNs with a combined approach (SpinePose). The median errors of each training approach were compared relative to the ground truth. SpinePose exhibited lower median errors relative to ground truth annotations at all parameters. L1PA was not included in this analysis because the bottom-up model was not trained to predict this parameter. LL = lumbar lordosis; PI = pelvic incidence; PT = pelvic tilt; SS = sacral slope; SVA = sagittal vertical axis; T1PA = T1 pelvic angle.
  • Figure 3: Percent of keypoint predictions within different distance thresholds. Plots depict keypoint detection accuracy across all spinal landmarks and a range of distance thresholds from ground truth (1-10 mm). *Note: The landmarks at each femoral head were combined into a single femoral midpoint landmark to account for difficulty in distinguishing right vs. left on a sagittal X-ray.
  • Figure 4: Visualization of model vs. ground truth keypoint predictions in 4 patients. Images A and B show a whole-spine and lumbosacral X-ray, respectively, without instrumentation. Images C and D show the same modalities with spinal instrumentation.
  • Figure 5: Intraclass correlation coefficient (ICC) heatmap. An ICC was calculated at each parameter between 2 separate raters. On a scale of 0-1, the ICC reflects inter-rater similarity among scores within a given class. SpinePose (AI) shows excellent reliability at all parameters when compared to ground truth (GT) as well as to each of the 3 remaining raters (R2-R4), who were a fellowship-trained spine surgeon (R2), a neuroradiologist (R3), and a senior neurosurgery resident (R4). L1PA = L1 pelvic angle; LL = lumbar lordosis; PI = pelvic incidence; PT = pelvic tilt; SS = sacral slope; SVA = sagittal vertical axis; T1PA = T1 pelvic angle.
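Step (D) of Figure 1, where parameters are calculated from the combined keypoints, can be sketched with the standard geometric definitions of SS, PT, PI, SVA, T1PA, and L1PA. This is an illustrative sketch, not the authors' implementation: the keypoint names (`s1_ant`, `s1_post`, `fem_head_a`, and so on) are hypothetical, the paper's exact 9-keypoint layout is not given in the captions, and coordinates are assumed to be in millimetres with x increasing anteriorly and y increasing cranially. LL is omitted because it needs the L1 endplate, which this sketch does not model.

```python
import numpy as np

def angle_between(u, v):
    """Unsigned angle in degrees between two 2D vectors."""
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    cross = u[0] * v[1] - u[1] * v[0]
    return float(np.degrees(np.arctan2(abs(cross), np.dot(u, v))))

def spinopelvic_parameters(kp):
    """Compute parameters from a dict of hypothetical keypoint names -> (x, y) in mm.

    Assumed axes: x increases anteriorly, y increases cranially.
    Angles are unsigned, so the values are meaningful for typical anatomy only.
    """
    pts = {name: np.asarray(xy, dtype=float) for name, xy in kp.items()}
    s1_mid = (pts["s1_ant"] + pts["s1_post"]) / 2        # midpoint of S1 superior endplate
    hip = (pts["fem_head_a"] + pts["fem_head_b"]) / 2     # midpoint between femoral heads
    endplate = pts["s1_ant"] - pts["s1_post"]             # posterior -> anterior along S1
    normal = np.array([endplate[1], -endplate[0]])        # endplate normal, pointing caudally
    hip_to_s1 = s1_mid - hip

    return {
        "SS": angle_between(endplate, [1.0, 0.0]),        # endplate vs. horizontal
        "PT": angle_between(hip_to_s1, [0.0, 1.0]),       # hip->S1 line vs. vertical
        "PI": angle_between(normal, -hip_to_s1),          # endplate normal vs. S1->hip line
        "SVA": float(pts["c7"][0] - pts["s1_post"][0]),   # C7 plumb line offset from S1 corner
        "T1PA": angle_between(pts["t1"] - hip, hip_to_s1),
        "L1PA": angle_between(pts["l1"] - hip, hip_to_s1),
    }

# Hypothetical keypoints (mm), for illustration only.
example = {
    "c7": (20.0, 500.0), "t1": (15.0, 480.0), "l1": (5.0, 200.0),
    "s1_post": (0.0, 0.0), "s1_ant": (30.0, -25.0),
    "fem_head_a": (23.0, -61.0), "fem_head_b": (27.0, -59.0),
}
params = spinopelvic_parameters(example)
for name, value in params.items():
    print(f"{name}: {value:.1f}")
```

A useful sanity check on any such calculation is the geometric identity PI = PT + SS, which holds for the example above because PI is computed independently from the endplate normal rather than as a sum. Averaging the two femoral head keypoints also mirrors the note under Figure 3 that the heads are combined into a single femoral midpoint.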