Structured Spectral Graph Representation Learning for Multi-label Abnormality Analysis from 3D CT Scans

Theo Di Piazza, Carole Lazarus, Olivier Nempont, Loic Boussel

TL;DR

CT-SSG introduces a Structured Spectral Graph approach that models 3D CT volumes as graphs of triplet axial slices, enabling spectral-domain reasoning over inter-slice dependencies with a 2.5D representation. By using Chebyshev convolutions on a sparsely connected graph and axial positional embeddings, the method achieves strong cross-dataset generalization for multi-label chest abnormality classification while remaining computationally efficient for clinical deployment. The work also demonstrates transferability to automated radiology report generation and cross-domain abdominal CT data with data-efficient linear probing, backed by thorough ablation and robustness analyses. These findings suggest that explicitly structured, spectral graph priors can provide robust and transferable representations for 3D medical imaging tasks beyond segmentation, with practical implications for scalable clinical tools. Overall, CT-SSG offers a versatile backbone that bridges 2D slice-based features and full volumetric modeling, balancing expressiveness and efficiency in real-world workflows.

Abstract

With the growing volume of CT examinations, there is an increasing demand for automated tools such as organ segmentation, abnormality detection, and report generation to support radiologists in managing their clinical workload. Multi-label classification of 3D chest CT scans remains a critical yet challenging problem due to the complex spatial relationships inherent in volumetric data and the wide variability of abnormalities. Existing methods based on 3D convolutional neural networks struggle to capture long-range dependencies, while Vision Transformers often require extensive pre-training on large-scale, domain-specific datasets to perform competitively. In this work, we propose a 2.5D alternative by introducing a new graph-based framework that represents 3D CT volumes as structured graphs, where axial slice triplets serve as nodes processed through spectral graph convolution, enabling the model to reason over inter-slice dependencies while maintaining complexity compatible with clinical deployment. Our method, trained and evaluated on three datasets from independent institutions, achieves strong cross-dataset generalization and shows competitive performance compared to state-of-the-art visual encoders. We further conduct comprehensive ablation studies to evaluate the impact of various aggregation strategies, edge-weighting schemes, and graph connectivity patterns. Additionally, we demonstrate the broader applicability of our approach through transfer experiments on automated radiology report generation and abdominal CT data.
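
To make the graph construction described above more concrete, the following is a minimal sketch rather than the authors' implementation. It assumes non-overlapping triplets of adjacent axial slices, a sparse connectivity limited to a hypothetical `max_hops` neighborhood, and a Gaussian kernel on the physical z-distance as the edge weight. The function name `build_triplet_graph` and the parameters `spacing_z`, `max_hops`, and `sigma` are illustrative and do not come from the paper.

```python
import numpy as np


def build_triplet_graph(volume, spacing_z=1.0, max_hops=2, sigma=None):
    """Group axial slices into triplet nodes and build a sparse weighted adjacency.

    volume: array of shape (D, H, W), axial slices stacked along axis 0.
    Returns (nodes, adjacency) with shapes (N, 3, H, W) and (N, N).
    All design choices here are assumptions made for illustration.
    """
    depth = volume.shape[0]
    n_nodes = depth // 3  # trailing slices that do not fill a triplet are dropped
    nodes = volume[: n_nodes * 3].reshape(n_nodes, 3, *volume.shape[1:])

    # Physical z-position of each node, taken at the center slice of its triplet.
    z = (np.arange(n_nodes) * 3 + 1) * spacing_z
    dist = np.abs(z[:, None] - z[None, :])

    # Number of node steps between each pair, used to keep the graph sparse.
    idx = np.arange(n_nodes)
    hops = np.abs(idx[:, None] - idx[None, :])

    if sigma is None:
        sigma = 3.0 * spacing_z * max_hops  # assumed kernel width

    # Assumed edge weighting: closer triplets get larger weights.
    weights = np.exp(-(dist ** 2) / (2.0 * sigma ** 2))
    adjacency = np.where((hops > 0) & (hops <= max_hops), weights, 0.0)
    return nodes, adjacency
```

For a volume of 10 slices this yields 3 nodes (the last slice is dropped), no self-loops, and weights that decrease as the z-distance between triplets grows.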

Paper Structure

This paper contains 52 sections, 10 equations, 15 figures, 9 tables.

Figures (15)

  • Figure 1: Axial slices from 3D CT Scans, with abnormalities manually contoured in red, illustrating distinct visual characteristics.
  • Figure 2: CT-SSG Architecture Overview. Adjacent axial slices are grouped into triplets, each representing a node in a graph. Edges between nodes are weighted according to their physical distance along the z-axis. Node features are enhanced with Triplet Axial Slices positional embeddings, and then processed by a Spectral Block that incorporates Chebyshev graph convolution for structured spectral modeling. The resulting node representations are aggregated via mean pooling and passed to a classification head to predict abnormalities.
  • Figure 3: Spectral Block with detailed notations. Input features are given to a first normalization layer, followed by spectral graph convolutions with a residual skip connection. These updated features are then fed to a feedforward neural network followed by a second normalization layer with a residual skip connection.
  • Figure 4: Comprehensive analysis of the datasets. Metadata are not available for the Rad-ChestCT dataset. a) Abnormalities from CT-HCL are extracted with a BERT-based language model trained on French radiology reports from manually extracted annotations. b) CT-HCL comprises data from 2,000 unique patients, with ages ranging from 20 to 100 years. c) CT-HCL volumes come from Hospices Civils de Lyon, with scanners from four manufacturers. d) CT-HCL volumes were acquired from both male and female patients.
  • Figure 5: F1-Score per abnormality for the 18 abnormalities from the CT-RATE test set, comparing our proposed CT-SSG with representative 3D Convolutional and 3D Transformer baselines. For clarity, one representative model per family is reported. CT-SSG consistently improves over both baselines, with the largest absolute gains observed in Pericardial effusion (+$\Delta$8.96%), Calcification (+$\Delta$6.23%), and Pleural effusion (+$\Delta$6.20%).
  • ...and 10 more figures
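
The captions of Figures 2 and 3 mention Chebyshev graph convolution followed by mean pooling over nodes. As a rough illustration of that operation, the sketch below implements the standard Chebyshev polynomial filter on the scaled normalized Laplacian in NumPy. It is not the authors' code: the function name `chebyshev_conv`, the weight layout `(K, F_in, F_out)`, and the exact computation of the largest eigenvalue are assumptions, and normalization layers, the feedforward network, and residual connections of the Spectral Block are omitted.

```python
import numpy as np


def chebyshev_conv(x, adjacency, weights):
    """Chebyshev spectral graph convolution (illustrative sketch).

    x:         node features, shape (N, F_in)
    adjacency: symmetric non-negative matrix, shape (N, N)
    weights:   one matrix per polynomial order, shape (K, F_in, F_out)
    """
    n = adjacency.shape[0]
    deg = adjacency.sum(axis=1)
    # D^{-1/2}, with isolated nodes mapped to 0 to avoid division by zero.
    inv_sqrt = np.where(deg > 0, 1.0 / np.sqrt(np.maximum(deg, 1e-12)), 0.0)
    laplacian = np.eye(n) - inv_sqrt[:, None] * adjacency * inv_sqrt[None, :]

    # Rescale the spectrum to [-1, 1], the domain of Chebyshev polynomials.
    lam_max = np.linalg.eigvalsh(laplacian).max()
    scaled = 2.0 * laplacian / lam_max - np.eye(n)

    # T_0(L) x = x and T_1(L) x = L x.
    t_prev, t_curr = x, scaled @ x
    out = t_prev @ weights[0]
    if len(weights) > 1:
        out = out + t_curr @ weights[1]
    # Recurrence: T_k = 2 L T_{k-1} - T_{k-2}.
    for k in range(2, len(weights)):
        t_prev, t_curr = t_curr, 2.0 * scaled @ t_curr - t_prev
        out = out + t_curr @ weights[k]
    return out
```

A volume-level representation in the spirit of Figure 2 could then be obtained with `chebyshev_conv(x, adjacency, weights).mean(axis=0)`, which averages the updated node features before a classification head.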