Depth-Sequence Transformer (DST) for Segment-Specific ICA Calcification Mapping on Non-Contrast CT

Xiangjian Hou, Ebru Yaman Akcicek, Xin Wang, Kazem Hashemizadeh, Scott Mcnally, Chun Yuan, Xiaodong Ma

TL;DR

This work targets segment-specific ICAC quantification on NCCT, arguing that total ICAC volume hides location-dependent risks. It introduces the Depth-Sequence Transformer (DST), which linearizes 3D localization into a 1D axial sequence processed by a CNN–Transformer to predict $N=6$ independent landmark distributions, aided by a deterministic hemisphere separation that serves as an anatomical prior. DST achieves high localization accuracy on a 100-patient cohort (MAE about 0.13 slices with ~96% within ±1 slice) and demonstrates strong generality on the Clean-CC-CCII benchmark with competitive classification performance and favorable efficiency. The approach provides a practical tool for segment-specific ICAC analysis and establishes a versatile 3D backbone suitable for broader volumetric medical imaging tasks, with potential to inform diagnosis, prognosis, and procedural planning.

Abstract

While total intracranial carotid artery calcification (ICAC) volume is an established stroke biomarker, growing evidence shows this aggregate metric ignores the critical influence of plaque location, since calcification in different segments carries distinct prognostic and procedural risks. However, a finer-grained, segment-specific quantification has remained technically infeasible. Conventional 3D models are forced to process downsampled volumes or isolated patches, sacrificing the global context required to resolve anatomical ambiguity and render reliable landmark localization. To overcome this, we reformulate the 3D challenge as a \textbf{Parallel Probabilistic Landmark Localization} task along the 1D axial dimension. We propose the \textbf{Depth-Sequence Transformer (DST)}, a framework that processes full-resolution CT volumes as sequences of 2D slices, learning to predict $N=6$ independent probability distributions that pinpoint key anatomical landmarks. Our DST framework demonstrates exceptional accuracy and robustness. Evaluated on a 100-patient clinical cohort with rigorous 5-fold cross-validation, it achieves a Mean Absolute Error (MAE) of \textbf{0.1 slices}, with \textbf{96\%} of predictions falling within a $\pm1$ slice tolerance. Furthermore, to validate its architectural power, the DST backbone establishes the best result on the public Clean-CC-CCII classification benchmark under an end-to-end evaluation protocol. Our work delivers the first practical tool for automated segment-specific ICAC analysis. The proposed framework provides a foundation for further studies on the role of location-specific biomarkers in diagnosis, prognosis, and procedural planning.
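
To make the 1D formulation and the reported metrics concrete, here is a minimal sketch of how per-landmark distributions over axial slices could be decoded and scored. It is an illustration only, not the authors' implementation: the function names, the argmax decoding rule, and the toy logits (3 landmarks over 5 slices instead of the paper's $N=6$ over a full volume) are all assumptions.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def decode_landmarks(logits_per_landmark):
    """Pick the most probable axial slice for each landmark independently.

    Assumption: argmax decoding. The paper may decode differently.
    """
    slices = []
    for logits in logits_per_landmark:
        probs = softmax(logits)
        slices.append(max(range(len(probs)), key=probs.__getitem__))
    return slices

def localization_metrics(predicted, truth, tolerance=1):
    """Return (MAE in slices, fraction of predictions within +/- tolerance)."""
    errors = [abs(p - t) for p, t in zip(predicted, truth)]
    mae = sum(errors) / len(errors)
    within = sum(e <= tolerance for e in errors) / len(errors)
    return mae, within

# Toy example with made-up values: one row of logits per landmark,
# one column per axial slice.
toy_logits = [
    [0.1, 2.0, 0.3, 0.0, -1.0],  # peak at slice 1
    [0.0, 0.2, 0.1, 3.0, 0.5],   # peak at slice 3
    [1.5, 0.0, 0.2, 0.1, 0.0],   # peak at slice 0
]
predicted = decode_landmarks(toy_logits)
mae, within = localization_metrics(predicted, truth=[1, 3, 2])
```

In this toy case two of the three landmarks are exact and one is off by two slices, so both the MAE and the within-tolerance fraction come out to 2/3. Because each landmark has its own distribution, the decoding step treats the landmarks independently, which is the sense of "parallel" in the task name.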

Paper Structure

This paper contains 25 sections, 2 equations, 4 figures, 5 tables, 1 algorithm.

Figures (4)

  • Figure 1: Simplified four‐segment scheme of the internal carotid artery (ICA) used in this study. From proximal to distal, the (i) Cervical, (ii) Petrous, (iii) Cavernous (“siphon”), and (iv) Supraclinoid segments are distinctly color‐coded along the vessel centerline. Lines inside indicate axial reference planes, manually delineated by board‐certified neuroradiologists, that define the boundaries between adjacent segments.
  • Figure 2: The Failure of Patch-Based Models in Tasks Requiring Global Context. (Left) Ground truth labels for left (green) and right (red) ICA segments. (Center) A standard patch-based model (nnU-Net) catastrophically mislabels the hemispheres due to a lack of global context. (Right) Our proposed method correctly utilizes global information to assign anatomically correct labels.
  • Figure 3: An overview of the proposed Depth-Sequence Transformer (DST) framework. (Left Panel) The main pipeline: (A) A Slice-wise Encoder generates a sequence of high-dimensional feature vectors, one for each axial slice. (B) A '[cls]' token is added for classification tasks, and the sequence is left-padded to a fixed length. (C) A stack of N DST blocks processes the entire sequence using global self-attention. (D) Finally, two heads produce the outputs: a Classification Head operates on the '[cls]' token, while our primary Localization Head produces N independent probability distributions, each pinpointing one anatomical landmark. (Right Panels) Breakout diagrams of key modules: The Slice-wise Encoder Block (top right) uses anisotropic 3D convolutions to preserve full axial resolution. The DepthAttentionBlock (bottom middle) illustrates our convolution-enhanced self-attention.
  • Figure 4: Excellent correlation between predicted and ground-truth calcium volumes for all intracranial segments. Each scatter plot represents one of the eight ICA segments, plotting the predicted volume (y-axis) against the ground-truth volume (x-axis) on a log-log scale. The dashed line indicates perfect agreement. Our DST-based pipeline demonstrates very high and statistically significant Pearson correlations ($p < 0.001$) for all six intracranial segments. Note that no meaningful correlation is observed for the extracranial Cervical segments, because this anatomical region was at the edge of the scan's field of view, leading to near-zero calcification volumes for most subjects.
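
The per-segment agreement described in the Figure 4 caption can be sketched as a Pearson correlation between predicted and ground-truth volumes. The snippet below is a hedged illustration, not the paper's evaluation code: the function names, the optional log transform, and the `eps` offset are assumptions, and the caption does not say whether the correlation was computed on raw or log-transformed volumes. The sketch also shows why a segment with near-zero volumes for most subjects gives no meaningful correlation: when one variable has no variance, the coefficient is undefined.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient, or None if either input is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None  # undefined, e.g. all ground-truth volumes are zero
    return cov / math.sqrt(var_x * var_y)

def volume_correlation(predicted, truth, log_scale=False, eps=1.0):
    """Correlate per-subject volumes for one segment.

    Assumption: with log_scale=True, volumes are mapped through
    log10(v + eps) so that zero volumes remain finite.
    """
    if log_scale:
        predicted = [math.log10(v + eps) for v in predicted]
        truth = [math.log10(v + eps) for v in truth]
    return pearson(predicted, truth)

# Made-up volumes for illustration only.
r_linear = volume_correlation([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
r_log = volume_correlation([9.0, 99.0, 999.0], [9.0, 99.0, 999.0], log_scale=True)
r_degenerate = volume_correlation([0.0, 0.1, 0.0], [0.0, 0.0, 0.0])
```

Here `r_degenerate` is `None` because every ground-truth volume is zero, which mirrors the situation the caption reports for the Cervical segments.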