A Systematic Analysis of Input Modalities for Fracture Classification of the Paediatric Wrist

Ron Keuth, Maren Balks, Sebastian Tschauner, Ludger Tüshaus, Mattias Heinrich

TL;DR

This study addresses paediatric distal forearm fracture classification by extending radiograph-based models to a multimodal framework. It fuses spatial information from radiographs, automatic bone segmentation, fracture-location heatmaps, and radiology-report embeddings (via CLIP) to predict AO/OTA classes in a multilabel setting. The results show that integrating all modalities yields the best performance (AUROC up to 93.26), with fracture-location information driving the largest gains, while segmentation quality and report-only cues have limited impact. The work highlights the potential for improving clinical decision support in paediatric fracture management and provides code and a public dataset foundation for further research.

Abstract

Fractures, particularly in the distal forearm, are among the most common injuries in children and adolescents, with approximately 800 000 cases treated annually in Germany. The AO/OTA system provides a structured fracture type classification, which serves as the foundation for treatment decisions. Although accurately classifying fractures can be challenging, current deep learning models have demonstrated performance comparable to that of experienced radiologists. While most existing approaches rely solely on radiographs, the potential impact of incorporating additional modalities, such as automatic bone segmentation, fracture location, and radiology reports, remains underexplored. In this work, we systematically analyse the contribution of these three additional information types, finding that combining them with radiographs increases the AUROC from 91.71 to 93.25. Our code is available on GitHub.
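
To make the reported metric concrete, the following is a minimal sketch of how an AUROC can be computed in a multilabel setting. It assumes macro-averaging over classes and reporting on a 0 to 100 scale; neither choice is confirmed by the text above, and the function names and toy data are hypothetical.

```python
from typing import Sequence


def binary_auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUROC for one class: the probability that a random positive sample
    receives a higher score than a random negative one (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def macro_auroc(score_rows: Sequence[Sequence[float]],
                label_rows: Sequence[Sequence[int]]) -> float:
    """Mean of the per-class AUROCs (assumed averaging scheme).
    Rows are samples, columns are classes."""
    n_classes = len(label_rows[0])
    per_class = [
        binary_auroc([row[c] for row in score_rows],
                     [row[c] for row in label_rows])
        for c in range(n_classes)
    ]
    return sum(per_class) / n_classes


# Toy data: 4 samples, 2 hypothetical fracture classes.
scores = [[0.9, 0.2], [0.8, 0.7], [0.3, 0.6], [0.1, 0.4]]
labels = [[1, 0], [1, 1], [0, 0], [0, 1]]

auroc = macro_auroc(scores, labels)
print(f"macro AUROC: {100 * auroc:.2f}")  # prints "macro AUROC: 87.50"
```

In the toy data the first class is ranked perfectly (AUROC 1.0) and the second class has one misordered pair out of four (AUROC 0.75), which gives a mean of 0.875.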

Paper Structure

This paper contains 8 sections, 1 equation, 2 figures, 1 table.

Figures (2)

  • Figure 1.1: Usage of the four modalities for fracture classification. The radiograph, the bone segmentation, and the heatmap encoding the fractures' locations are fed to the ResNet18. The report embedding from the frozen CLIP text encoder, $\mathbf{z}_\text{text}\in\mathbb{R}^{512}$, is fused with the ResNet's latent vector $\mathbf{z}_\text{spatial}\in\mathbb{R}^{512}$. $\bigoplus$ denotes channel-wise concatenation.
  • Figure 1.2: AUROC comparison for fracture types (AO/OTA codes in captions) shows the benefit of including fracture location as input. Blue: Img, orange: Img + FracLoc, green: all modalities.
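
The data flow described in the caption of Figure 1.1 can be sketched in PyTorch as follows. This is an illustrative sketch under several assumptions rather than the authors' implementation: a small convolutional encoder stands in for the ResNet18, the segmentation is treated as a single channel, the fusion of $\mathbf{z}_\text{spatial}$ and $\mathbf{z}_\text{text}$ is assumed to be a concatenation followed by one linear layer, the CLIP embedding is replaced by a random tensor, and the class name and `num_classes=8` are hypothetical.

```python
import torch
from torch import nn


class MultimodalFractureClassifier(nn.Module):
    """Hypothetical fusion model: spatial inputs are stacked channel-wise,
    encoded to a 512-d latent, and concatenated with a 512-d text embedding."""

    def __init__(self, num_classes: int, in_channels: int = 3,
                 latent_dim: int = 512, text_dim: int = 512):
        super().__init__()
        # Stand-in for the ResNet18 backbone (assumption): any encoder that
        # maps (B, in_channels, H, W) to a (B, latent_dim) vector fits here.
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(32, latent_dim, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        # Multilabel head on the fused vector (assumed: a single linear layer).
        self.head = nn.Linear(latent_dim + text_dim, num_classes)

    def forward(self, radiograph, segmentation, heatmap, z_text):
        # Channel-wise concatenation of the three spatial modalities.
        x = torch.cat([radiograph, segmentation, heatmap], dim=1)
        z_spatial = self.encoder(x)                 # (B, 512)
        z = torch.cat([z_spatial, z_text], dim=1)   # (B, 1024)
        return self.head(z)                         # one logit per class


model = MultimodalFractureClassifier(num_classes=8)
batch = 2
radiograph = torch.rand(batch, 1, 64, 64)
segmentation = torch.rand(batch, 1, 64, 64)  # assumed single-channel mask
heatmap = torch.rand(batch, 1, 64, 64)       # fracture-location heatmap
z_text = torch.randn(batch, 512)             # placeholder for a frozen CLIP text embedding

logits = model(radiograph, segmentation, heatmap, z_text)
probs = torch.sigmoid(logits)  # independent per-class probabilities (multilabel)
```

A sigmoid per class, rather than a softmax over classes, matches the multilabel setting, where one radiograph can carry several fracture codes at once.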