M-SCAN: A Multistage Framework for Lumbar Spinal Canal Stenosis Grading Using Multi-View Cross Attention
Arnesh Batra, Arush Gumber, Anushk Kumar
TL;DR
The paper tackles the diagnostic burden and inter-reader variability of grading lumbar spinal canal stenosis from MRI by introducing M-SCAN, a three-stage framework that fuses sagittal and axial views through a sequence-based Multi-View Cross Attention architecture. It localizes spinal levels with a U-Net, crops contextual slices, and uses ROI proposals from YOLOv8 with an EfficientNet backbone; sequential (slice-to-slice) and cross-view interactions are modeled with Bi-GRU and LSTM components to produce level-specific stenosis grades. The approach reports state-of-the-art performance, with an AUROC of $0.971$ on a large, diverse dataset, and is robust to real-world variability in intensity histograms, slice counts, and resolutions. The result is a fully automated, scalable pipeline intended to improve diagnostic accuracy and workflow efficiency in clinical practice.
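To make the cross-view fusion idea concrete, the following is a minimal sketch of single-head scaled dot-product cross attention, in which feature vectors from one view (here labeled sagittal) attend over feature vectors from the other view (axial). It is not the authors' implementation: the function names, the toy 2-dimensional features, and the absence of learned projection matrices are all simplifying assumptions made for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Single-head scaled dot-product cross attention (illustrative only).

    queries: per-slice feature vectors from one view (assumed: sagittal).
    keys, values: per-slice feature vectors from the other view (assumed: axial).
    Returns the fused vectors (one per query) and the attention weights.
    """
    d = len(queries[0])
    scale = 1.0 / math.sqrt(d)
    fused, weights = [], []
    for q in queries:
        # Similarity of this query slice to every slice of the other view.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        w = softmax(scores)
        # Weighted sum of the other view's value vectors.
        out = [sum(wj * v[c] for wj, v in zip(w, values))
               for c in range(len(values[0]))]
        fused.append(out)
        weights.append(w)
    return fused, weights


# Hypothetical toy features: 2 sagittal slices, 3 axial slices, 2 dims each.
sagittal = [[1.0, 0.0], [0.0, 1.0]]
axial = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused, attn = cross_attention(sagittal, axial, axial)
```

In this toy case each sagittal slice produces one fused vector whose attention weights sum to 1 across the three axial slices. A full model would add learned query, key, and value projections and feed the fused sequence to the recurrent components described above.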
Abstract
The increasing prevalence of lumbar spinal canal stenosis has resulted in a surge of MRI (Magnetic Resonance Imaging) examinations, leading to labor-intensive interpretation and significant inter-reader variability, even among expert radiologists. This paper introduces a novel and efficient deep-learning framework that fully automates the grading of lumbar spinal canal stenosis. We demonstrate state-of-the-art performance in grading spinal canal stenosis on a dataset of 1,975 unique studies, each containing three distinct types of 3D cross-sectional spine images: Axial T2, Sagittal T1, and Sagittal T2/STIR. Employing a distinctive training strategy, our proposed multistage approach effectively integrates sagittal and axial images. This strategy employs a multi-view model with a sequence-based architecture, optimizing feature extraction and cross-view alignment to achieve an AUROC (Area Under the Receiver Operating Characteristic Curve) of 0.971 in spinal canal stenosis grading, surpassing other state-of-the-art methods.
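For readers unfamiliar with the headline metric, the sketch below computes AUROC from scratch using its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one, with ties counted as one half. The abstract does not state how the multi-class score is averaged, so the macro one-vs-rest averaging, the three grade classes, and the toy probabilities here are assumptions for illustration rather than the paper's evaluation protocol.

```python
def binary_auroc(labels, scores):
    """AUROC as P(score of positive > score of negative), ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def macro_ovr_auroc(y_true, prob_rows, num_classes):
    """Unweighted mean of one-vs-rest AUROCs (an assumed averaging scheme)."""
    aucs = []
    for c in range(num_classes):
        labels = [1 if y == c else 0 for y in y_true]
        scores = [row[c] for row in prob_rows]
        aucs.append(binary_auroc(labels, scores))
    return sum(aucs) / num_classes


# Hypothetical predictions for six spinal levels and three assumed grades.
y_true = [0, 0, 1, 1, 2, 2]
probs = [
    [0.7, 0.2, 0.1],
    [0.5, 0.3, 0.2],
    [0.2, 0.6, 0.2],
    [0.4, 0.4, 0.2],
    [0.1, 0.2, 0.7],
    [0.2, 0.5, 0.3],
]
score = macro_ovr_auroc(y_true, probs, num_classes=3)
```

The pairwise loop is quadratic in the number of samples, which is fine for a small illustration; a rank-based implementation would be preferable at the scale of the dataset described above.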
