OrthoInsight: Rib Fracture Diagnosis and Report Generation Based on Multi-Modal Large Models

Jinzhi Wang; Jiangbo Zhang; Chenzhan Yu; Zhigang Xiu; Duwei Dai; Ziyu Xu; Ningyong Wu; Wenhong Zhao

TL;DR

OrthoInsight tackles rib fracture diagnosis from CT by combining YOLOv9 detection, orthopedic knowledge retrieval, and report generation with a fine-tuned LLaVA model. The framework uses a dataset of 28,675 annotated CT images, expert reports, and a knowledge graph to produce detailed, clinically actionable diagnostic reports. It scores highly on Diagnostic Accuracy, Content Completeness, Logical Coherence, and Clinical Guidance Value (average score 4.28) and outperforms GPT-4 and Claude-3. The results illustrate the potential of multi-modal large models for automated medical image interpretation and radiologist support; the system is research-oriented and not a substitute for clinician judgment.

Abstract

The growing volume of medical imaging data has increased the need for automated diagnostic tools, especially for musculoskeletal injuries like rib fractures, commonly detected via CT scans. Manual interpretation is time-consuming and error-prone. We propose OrthoInsight, a multi-modal deep learning framework for rib fracture diagnosis and report generation. It integrates a YOLOv9 model for fracture detection, a medical knowledge graph for retrieving clinical context, and a fine-tuned LLaVA language model for generating diagnostic reports. OrthoInsight combines visual features from CT images with expert textual data to deliver clinically useful outputs. Evaluated on 28,675 annotated CT images and expert reports, it achieves high performance across Diagnostic Accuracy, Content Completeness, Logical Coherence, and Clinical Guidance Value, with an average score of 4.28, outperforming models like GPT-4 and Claude-3. This study demonstrates the potential of multi-modal learning in transforming medical image analysis and providing effective support for radiologists.
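
The three stages named in the abstract (detection, knowledge retrieval, report generation) can be pictured as a simple pipeline. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: `Detection`, `retrieve_context`, `build_prompt`, `run_pipeline`, the `min_score` threshold, and the prompt wording are all hypothetical names chosen for this example. The detector and generator are passed in as plain callables standing in for a YOLOv9 model and a fine-tuned LLaVA model, and the knowledge graph is reduced to a dictionary lookup.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class Detection:
    """One detected fracture region (hypothetical structure)."""
    box: Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels
    label: str                              # fracture type (hypothetical labels)
    score: float                            # detector confidence in [0, 1]


# Toy stand-in for the medical knowledge graph: label -> clinical notes.
KnowledgeGraph = Dict[str, List[str]]


def retrieve_context(detections: List[Detection], graph: KnowledgeGraph) -> List[str]:
    """Collect clinical notes for each distinct detected label, in first-seen order."""
    seen, notes = set(), []
    for det in detections:
        if det.label not in seen:
            seen.add(det.label)
            notes.extend(graph.get(det.label, []))
    return notes


def build_prompt(detections: List[Detection], context: List[str]) -> str:
    """Combine detector findings and retrieved notes into one text prompt."""
    findings = "\n".join(
        f"- {d.label} at {d.box} (confidence {d.score:.2f})" for d in detections
    ) or "- no fracture detected"
    background = "\n".join(f"- {c}" for c in context) or "- none retrieved"
    return (
        f"Findings:\n{findings}\n"
        f"Clinical context:\n{background}\n"
        "Write a diagnostic report."
    )


def run_pipeline(
    image: Any,
    detector: Callable[[Any], List[Detection]],   # stand-in for the detection model
    graph: KnowledgeGraph,
    generator: Callable[[Any, str], str],         # stand-in for the report model
    min_score: float = 0.5,                       # assumed confidence cut-off
) -> str:
    """Detect, retrieve context, then generate a report from image plus prompt."""
    detections = [d for d in detector(image) if d.score >= min_score]
    context = retrieve_context(detections, graph)
    return generator(image, build_prompt(detections, context))
```

Passing the models in as callables keeps the sketch independent of any particular library; in a real system each callable would wrap the corresponding trained model.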
