ChexFract: From General to Specialized -- Enhancing Fracture Description Generation
Nikolay Nechaev, Evgeniia Przhezdzetskaia, Dmitry Umerenkov, Dmitry V. Dylov
TL;DR
The paper tackles the challenge of generating accurate fracture descriptions from chest X-ray images. It introduces ChexFract, a fracture-focused dataset built from radiology reports via sentence extraction and location-specific templating, and trains specialized vision-language models that pair domain-specific encoders with Phi-3.5. The study shows that end-to-end encoder adaptation and templated supervision yield meaningful gains over general-purpose radiology models, with ROC-AUC up to 0.715 and improved F1 for fracture detection. The authors publicly release their best-performing fracture-reporting models and discuss clinical implications, including a recall-precision tradeoff suited to screening workflows in which a radiologist reviews the output.
Abstract
Generating accurate and clinically meaningful radiology reports from chest X-ray images remains a significant challenge in medical AI. While recent vision-language models achieve strong results in general radiology report generation, they often fail to adequately describe rare but clinically important pathologies like fractures. This work addresses this gap by developing specialized models for fracture pathology detection and description. We train fracture-specific vision-language models with encoders from MAIRA-2 and CheXagent, demonstrating significant improvements over general-purpose models in generating accurate fracture descriptions. Analysis of model outputs by fracture type, location, and age reveals distinct strengths and limitations of current vision-language model architectures. We publicly release our best-performing fracture-reporting model, facilitating future research in accurate reporting of rare pathologies.
