Deep Learning-Based BMD Estimation from Radiographs with Conformal Uncertainty Quantification

Long Hui, Wai Lok Yeung

TL;DR

This work tackles opportunistic osteoporosis screening by estimating Bone Mineral Density (BMD) from knee radiographs using an EfficientNetV2-based regression model trained on the Osteoarthritis Initiative (OAI) dataset. It couples deep learning with Split Conformal Prediction to produce patient-specific prediction intervals with a statistical coverage guarantee, and compares traditional Test-Time Augmentation (TTA) with a multi-sample approach. The results show a moderate predictive relationship (Pearson $r \approx 0.68$) and empirical coverage close to the nominal levels (e.g., $94.8\%$ at the 95% level and $98.7\%$ at the 99% level), with multi-sample TTA yielding slightly tighter intervals. Although the anatomical mismatch between the knee and standard DXA measurement sites limits immediate clinical adoption, the framework demonstrates a methodology for trustworthy AI-driven BMD screening from routine radiographs and points to future improvements via data-centric AI and domain adaptation.

Abstract

Limited DXA access hinders osteoporosis screening. This proof-of-concept study proposes using widely available knee X-rays for opportunistic Bone Mineral Density (BMD) estimation via deep learning, emphasizing the robust uncertainty quantification that clinical use requires. An EfficientNetV2-based model was trained on the OAI dataset to predict BMD from bilateral knee radiographs. Two Test-Time Augmentation (TTA) methods were compared: traditional averaging and a multi-sample approach. Crucially, Split Conformal Prediction was implemented to provide statistically rigorous, patient-specific prediction intervals with guaranteed coverage. Results showed a Pearson correlation of 0.68 (traditional TTA). While traditional TTA yielded better point predictions, the multi-sample approach produced slightly tighter prediction intervals (at the 90%, 95%, and 99% levels) while maintaining coverage. The framework appropriately expressed higher uncertainty for challenging cases. Although anatomical mismatch between knee X-rays and standard DXA limits immediate clinical use, this method establishes a foundation for trustworthy AI-assisted BMD screening using routine radiographs, potentially improving early osteoporosis detection.

Paper Structure

This paper contains 28 sections, 5 figures, 1 table.

Figures (5)

  • Figure 1: Conceptual illustration of a symmetric conformal prediction interval for a regression task. $\hat{Y}_{test}$ is the model's point prediction, and $q_{1-\alpha}$ is derived from the calibration set to ensure at least $1-\alpha$ coverage. The prediction interval $[\hat{Y}_{test} - q_{1-\alpha}, \hat{Y}_{test} + q_{1-\alpha}]$ guarantees that $P(Y_{test} \in \text{Interval}) \geq 1-\alpha$.
  • Figure 2: Example of a knee radiograph from the OAI dataset before preprocessing.
  • Figure 3: Example of left and right knee radiographs after splitting the original image during preprocessing.
  • Figure 4: Architecture of the EfficientNetV2-based BMD regression model.
  • Figure 5: Scatter plot of true vs. predicted BMD values using the combined model with TTA. Points are color-coded by absolute error, with lighter colors indicating larger prediction errors. The perfect prediction line (red dashed) represents where predictions would exactly match true values. The Pearson correlation coefficient of 0.68 indicates a moderate positive correlation between predicted and actual BMD values.
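The symmetric interval described in the Figure 1 caption can be sketched in a few lines. This is a minimal illustration of generic split conformal prediction for regression, assuming absolute residuals as the nonconformity score and the usual finite-sample quantile index $\lceil (n+1)(1-\alpha) \rceil$; it is not the authors' code. The function names `conformal_quantile` and `prediction_interval` are hypothetical, as are the toy numbers in the usage note.

```python
import math

import numpy as np


def conformal_quantile(cal_true, cal_pred, alpha):
    """Return q_{1-alpha} from a held-out calibration set.

    Assumption: the nonconformity score is the absolute residual
    |y - y_hat|, which gives the symmetric interval of Figure 1.
    """
    scores = np.abs(np.asarray(cal_true, dtype=float)
                    - np.asarray(cal_pred, dtype=float))
    n = scores.size
    # Finite-sample correction: take the ceil((n + 1)(1 - alpha))-th
    # smallest score rather than the plain empirical quantile.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: only the trivial
        # (infinite) interval carries the coverage guarantee.
        return float("inf")
    return float(np.sort(scores)[k - 1])


def prediction_interval(y_hat, q):
    """Symmetric interval [y_hat - q, y_hat + q] around a point prediction."""
    return y_hat - q, y_hat + q
```

With toy calibration residuals of 1 through 7 and $\alpha = 0.25$, the index is $\lceil 8 \times 0.75 \rceil = 6$, so `conformal_quantile` returns the sixth smallest score and every test prediction receives an interval of half-width 6. The coverage statement $P(Y_{test} \in \text{Interval}) \geq 1-\alpha$ holds marginally, under the assumption that calibration and test points are exchangeable.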