iMedImage Technical Report
Ran Wei, ZhiXiong Lan, Qing Yan, Ning Song, Ming Lv, LongQing Ye
TL;DR
This work presents iMedImage, a general-purpose foundation model for medical imaging that unifies multimodal inputs and tasks. It combines a unified representation, Transformer-based processing, multi-level analysis, Chain of Thought embedding, and Mixture of Experts to handle diverse data from chromosomes to CT. On a large-scale, multi-center Chinese dataset, it achieves 92.75% sensitivity and 91.5% specificity in automated chromosome abnormality detection, and it demonstrates strong performance on BreastMNIST, preterm birth prediction, and pancreatic cancer recurrence prediction without ROI annotation. The results support broad generalizability and potential for low-cost transfer to clinical workflows. Future work includes expanding the supported modalities and prospective clinical validation.
Abstract
Background: Chromosome karyotype analysis is crucial for diagnosing hereditary diseases, yet detecting structural abnormalities remains challenging. While AI has shown promise in medical imaging, its effectiveness varies across modalities. Building on advances in foundation models that integrate multimodal medical imaging for robust feature extraction and accurate diagnosis, we developed iMedImage, an end-to-end model for general medical image recognition. It demonstrates strong performance across multiple imaging tasks, including chromosome abnormality detection.

Materials and Methods: We constructed a comprehensive medical image dataset encompassing multiple modalities from common medical domains, including chromosome, cell, pathology, ultrasound, X-ray, CT, and MRI images. Based on this dataset, we developed the iMedImage model, which incorporates the following key features: (1) a unified representation method for diverse modality inputs and medical imaging tasks; (2) multi-level (case-level, image-level, patch-level) image recognition capabilities enhanced by Chain of Thought (CoT) embedding and Mixture of Experts (MoE) strategies.

Results: The test set comprised data from 12 institutions across six regions in China, covered three mainstream scanning devices, and included naturally distributed, unscreened abnormal cases. On this diverse dataset, the model performed a fully automated chromosome analysis workflow, including segmentation, karyotyping, and abnormality detection, reaching a sensitivity of 92.75% and a specificity of 91.5%.

Conclusion: We propose iMedImage, an end-to-end foundation model for medical image analysis, and demonstrate its strong performance across various medical imaging tasks. iMedImage provides clinicians with a precise imaging analysis tool and contributes to improving diagnostic accuracy and disease screening.
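For readers less familiar with the two headline metrics, the following is a minimal sketch of how sensitivity and specificity are conventionally computed from case-level confusion counts. It is not taken from the iMedImage evaluation code. The function name `sensitivity_specificity` and the counts in the usage example are hypothetical and chosen only for illustration; they are not the study's data.

```python
def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> tuple[float, float]:
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp: abnormal cases correctly flagged as abnormal
    fn: abnormal cases missed (reported as normal)
    tn: normal cases correctly reported as normal
    fp: normal cases incorrectly flagged as abnormal
    """
    if min(tp, fn, tn, fp) < 0:
        raise ValueError("counts must be non-negative")
    if tp + fn == 0:
        raise ValueError("no abnormal cases: sensitivity is undefined")
    if tn + fp == 0:
        raise ValueError("no normal cases: specificity is undefined")
    sensitivity = tp / (tp + fn)  # fraction of abnormal cases detected
    specificity = tn / (tn + fp)  # fraction of normal cases cleared
    return sensitivity, specificity


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not the paper's data).
    sens, spec = sensitivity_specificity(tp=90, fn=10, tn=180, fp=20)
    print(f"sensitivity = {sens:.2%}, specificity = {spec:.2%}")
```

Because the test set contains naturally distributed, unscreened cases, abnormal cases are likely a minority, so the two rates are worth reading together: sensitivity depends only on the abnormal cases and specificity only on the normal ones, and neither is affected by the class balance.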
