M-SCAN: A Multistage Framework for Lumbar Spinal Canal Stenosis Grading Using Multi-View Cross Attention

Arnesh Batra, Arush Gumber, Anushk Kumar

TL;DR

The paper addresses the labor-intensive interpretation and inter-reader variability involved in grading lumbar spinal canal stenosis from MRI. It introduces M-SCAN, a three-stage framework that fuses sagittal and axial views through a sequence-based MultiView Cross Attention architecture. The framework localizes spinal levels with a U-Net, crops contextual slices, and uses ROI proposals from YOLOv8 with an EfficientNet backbone; sequential and cross-view interactions are modeled with Bi-GRU and LSTM components to produce level-specific stenosis grades. The approach reports an AUROC of $0.971$ on a dataset of 1,975 studies, which the authors describe as state of the art, and it is presented as robust to real-world variability in intensity histograms, slice counts, and resolutions. The pipeline is fully automated and is intended to improve diagnostic accuracy and workflow efficiency in clinical practice.

Abstract

The increasing prevalence of lumbar spinal canal stenosis has resulted in a surge in MRI (Magnetic Resonance Imaging) examinations, leading to labor-intensive interpretation and significant inter-reader variability, even among expert radiologists. This paper introduces a novel and efficient deep-learning framework that fully automates the grading of lumbar spinal canal stenosis. We demonstrate state-of-the-art performance in grading spinal canal stenosis on a dataset of 1,975 unique studies, each containing three distinct types of 3D cross-sectional spine images: Axial T2, Sagittal T1, and Sagittal T2/STIR. Employing a distinctive training strategy, our proposed multistage approach effectively integrates sagittal and axial images. This strategy employs a multi-view model with a sequence-based architecture, optimizing feature extraction and cross-view alignment to achieve an AUROC (Area Under the Receiver Operating Characteristic Curve) of 0.971 in spinal canal stenosis grading, surpassing other state-of-the-art methods.
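For readers unfamiliar with the headline metric, the sketch below shows one common way to compute AUROC for a three-grade problem. The abstract does not say how the 0.971 figure was averaged across grades, so the macro one-vs-rest averaging used here is an assumption, and the function names `binary_auroc` and `macro_ovr_auroc` are hypothetical.

```python
def binary_auroc(labels, scores):
    """AUROC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def macro_ovr_auroc(labels, probs, num_classes):
    """Unweighted mean of one-vs-rest AUROCs (assumed averaging, not from the paper)."""
    per_class = []
    for c in range(num_classes):
        # Treat grade c as the positive class and all other grades as negative.
        binary = [1 if y == c else 0 for y in labels]
        class_scores = [p[c] for p in probs]
        per_class.append(binary_auroc(binary, class_scores))
    return sum(per_class) / num_classes
```

Calling `macro_ovr_auroc(labels, probs, 3)` with integer grade labels and per-grade predicted probabilities returns a value in the range 0 to 1, where 0.5 corresponds to chance-level ranking.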

Paper Structure

This paper contains 8 sections, 1 equation, 3 figures, 1 table.

Figures (3)

  • Figure 1: Dataset samples showing the three stenosis grades present in the dataset, each of which can be diagnosed from sagittal and axial images.
  • Figure 2: Overview of the multi-stage framework for diagnosing spinal canal stenosis from multiple views of the spine.
  • Figure 3: Architecture of the MultiView attention-based network: an image encoder based on the EfficientNet trained in stage two, followed by a sequence model comprising LSTM, GRU, and attention layers, with five classification heads for output.
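
The cross-view fusion named in the title can be illustrated with plain scaled dot-product cross attention. This is a minimal, dependency-free sketch and not the authors' implementation: it has no learned projections, it uses toy feature vectors in place of EfficientNet embeddings, and it assumes that sagittal slice features act as queries over axial slice features, which this summary does not specify. The names `softmax` and `cross_attention` are hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention where each query attends over all keys.

    queries: one feature vector per slice of the querying view
    keys, values: one feature vector per slice of the other view
    Returns the fused vectors and the attention weights per query.
    """
    scale = 1.0 / math.sqrt(len(keys[0]))
    out_dim = len(values[0])
    outputs, weights = [], []
    for q in queries:
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        attn = softmax(scores)
        weights.append(attn)
        # Weighted sum of the other view's value vectors.
        outputs.append(
            [sum(a * v[j] for a, v in zip(attn, values)) for j in range(out_dim)]
        )
    return outputs, weights
```

Each output row is a mixture of the other view's slice features, weighted by how strongly that query slice matches each of them. In a trained model the queries, keys, and values would come from learned linear projections of the encoder outputs rather than from the raw features used here.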