Transformer-Based Multi-Region Segmentation and Radiomic Analysis of HR-pQCT Imaging for Osteoporosis Classification

Mohseu Rashid Subah; Mohammed Abdul Gani Zilani; Thomas L. Nickolas; Matthew R. Allen; Stuart J. Warden; Rachel K. Surowiec

TL;DR

This work is the first to leverage a transformer-based segmentation architecture, i.e., the SegFormer, for fully automated multi-region HR-pQCT analysis, and demonstrates that automated, multi-region HR-pQCT segmentation enables the extraction of clinically informative signals beyond bone alone.

Abstract

Osteoporosis is a skeletal disease typically diagnosed using dual-energy X-ray absorptiometry (DXA), which quantifies areal bone mineral density but overlooks bone microarchitecture and surrounding soft tissues. High-resolution peripheral quantitative computed tomography (HR-pQCT) enables three-dimensional microstructural imaging with minimal radiation. However, current analysis pipelines largely focus on mineralized bone compartments, leaving much of the acquired image data underutilized. We introduce a fully automated framework for binary osteoporosis classification using radiomics features extracted from anatomically segmented HR-pQCT images. To our knowledge, this work is the first to leverage a transformer-based segmentation architecture, i.e., the SegFormer, for fully automated multi-region HR-pQCT analysis. The SegFormer model simultaneously delineated the cortical and trabecular bone of the tibia and fibula along with surrounding soft tissues and achieved a mean F1 score of 95.36%. Soft tissues were further subdivided into skin, myotendinous, and adipose regions through post-processing. From each region, 939 radiomic features were extracted and dimensionally reduced to train six machine learning classifiers on an independent dataset comprising 20,496 images from 122 HR-pQCT scans. The best image-level performance was achieved using myotendinous tissue features, yielding an accuracy of 80.08% and an area under the receiver operating characteristic curve (AUROC) of 0.85, outperforming bone-based models. At the patient level, replacing standard biological, DXA, and HR-pQCT parameters with soft-tissue radiomics improved AUROC from 0.792 to 0.875. These findings demonstrate that automated, multi-region HR-pQCT segmentation enables the extraction of clinically informative signals beyond bone alone, highlighting the importance of integrated tissue assessment for osteoporosis detection.
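The abstract reports segmentation quality as a mean F1 score over the segmented regions. The sketch below shows one common way to compute such a figure: per-class F1 from pixel-wise label maps, followed by an unweighted mean. It assumes unweighted (macro) averaging, which the abstract does not specify; the class ids and the toy label maps are hypothetical and are not data from the paper.

```python
from collections import Counter


def per_class_f1(y_true, y_pred, num_classes):
    """Return the F1 score of each class for two flat sequences of integer labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # pixel wrongly assigned to class p
            fn[t] += 1  # pixel of class t that was missed
    scores = []
    for c in range(num_classes):
        denom = 2 * tp[c] + fp[c] + fn[c]
        # A class absent from both maps has no defined F1; this sketch scores it as 1.0.
        scores.append(2 * tp[c] / denom if denom else 1.0)
    return scores


def mean_f1(y_true, y_pred, num_classes):
    """Unweighted (macro) mean of the per-class F1 scores (an assumed averaging scheme)."""
    scores = per_class_f1(y_true, y_pred, num_classes)
    return sum(scores) / len(scores)


# Toy 2x4 label maps flattened row by row; class ids 0..2 are hypothetical.
truth = [0, 0, 1, 1, 2, 2, 2, 0]
pred = [0, 0, 1, 2, 2, 2, 2, 0]

scores = per_class_f1(truth, pred, 3)
print(scores, mean_f1(truth, pred, 3))
```

For a single class, F1 is identical to the Dice coefficient, so the same function can be read as a per-region Dice computation.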

Paper Structure (26 sections, 8 equations, 7 figures, 5 tables)

Introduction
Materials and Methods
Study Population and Data Acquisition
Segmentation Dataset
Classification Dataset
Data Preprocessing
Annotation
Standardization
Segmentation
Architecture
Training and Optimization
Post-processing
Soft Tissue Segmentation
Radiomics Feature Analysis
Feature Extraction
...and 11 more sections

Figures (7)

Figure 1: Input HR-pQCT slice to the SegFormer model (left), five-region output of the model highlighting the overall soft tissues in green (middle), and refined output with seven distinct regions after soft tissue segmentation (right). The abbreviations indicated in parentheses are used throughout the text.
Figure 2: Schematic of the proposed SegFormer-based segmentation pipeline. Standardized HR-pQCT inputs are processed through the encoder-decoder framework to generate a five-class semantic map, followed by post-processing and soft-tissue segmentation to obtain the final seven-class output. For each of the four transformer blocks in the encoder, M denotes the number of repeated efficient self-attention and mix feed-forward network units, with M = 3, 4, 18, and 3. In the decoder, H, W, and C denote the height, width, and channel dimension of the input feature maps.
Figure 3: Overview of the proposed osteoporosis classification framework. (a) From each of the seven segmented regions except skin, 939 radiomics features are extracted and reduced through feature selection. (b) Region-specific models are trained using six machine learning classifiers with five-fold cross-validation on 16,464 images and tested on 4,032 held-out images. (c) Different features are grouped into three subsets: non-radiomics, radiomics tibia, and radiomics soft-tissue. Selected features from the subsets are used to construct patient-level models (98 training, 24 testing subjects) for binary osteoporosis prediction.
Figure 4: Qualitative semantic segmentation performance on a sample HR-pQCT image (7.3% distal tibia region). The top row shows the full HR-pQCT image encompassing all five regions, and the bottom row highlights the fibula. U-Net–based architectures demonstrate noticeable pixel misclassification, which is substantially reduced with the SegFormer model (highlighted in red). The remaining minor inconsistencies are addressed through post-processing.
Figure 5: Retained features after the feature selection process. (a) Features grouped by filter classes. (b) Features grouped by feature classes.
...and 2 more figures
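
The image and subject counts quoted in the abstract and in the Figure 3 caption can be checked against each other with a little arithmetic. The sketch below does so under two assumptions that the text above does not state outright: each subject contributes one scan, and every scan contributes the same number of slices. The helper `split_subjects` is hypothetical and only illustrates holding out whole subjects so that no subject's slices appear in both sets; it is not the authors' split procedure.

```python
import random

# Counts quoted in the abstract and the Figure 3 caption.
TOTAL_IMAGES = 20_496
TOTAL_SCANS = 122
TRAIN_SUBJECTS, TEST_SUBJECTS = 98, 24
TRAIN_IMAGES, TEST_IMAGES = 16_464, 4_032

# Assumption: equal slice count per scan, so the total divides evenly.
slices_per_scan, remainder = divmod(TOTAL_IMAGES, TOTAL_SCANS)


def split_subjects(subject_ids, n_test, seed=0):
    """Hold out n_test whole subjects (hypothetical helper, not from the paper)."""
    ids = sorted(set(subject_ids))
    rng = random.Random(seed)
    rng.shuffle(ids)
    return ids[n_test:], ids[:n_test]


train_ids, test_ids = split_subjects(range(TOTAL_SCANS), TEST_SUBJECTS)

print(slices_per_scan, remainder)
print(len(train_ids) * slices_per_scan, len(test_ids) * slices_per_scan)
```

Under these assumptions the numbers agree: 122 scans of 168 slices give 20,496 images, and a 98/24 subject split gives 16,464 training and 4,032 test images.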
