Feasibility Study of a Diffusion-Based Model for Cross-Modal Generation of Knee MRI from X-ray: Integrating Radiographic Feature Information
Zhe Wang, Yung Hsin Chen, Aladine Chetouani, Fabian Bauer, Yuhua Ru, Fang Chen, Liping Zhang, Rachid Jennane, Mohamed Jarraya
TL;DR
This study tackles the gap between knee X-ray accessibility and MRI's soft-tissue diagnostic detail by proposing a diffusion-based cross-modal framework to synthesize knee MRI volumes from X-ray inputs. The approach combines a classical conditional latent diffusion model with an AutoencoderKL-based latent space and a guidance module that injects target depth and patient-specific radiographic features, enabling 3D MRI volume generation from 2D inputs. Results show MRI slices generated by the method are visually closer to real MRI and exhibit improved region-specific fidelity (e.g., KOA-related features) compared with state-of-the-art diffusion baselines; ablations confirm the added value of radiographic guidance, and increasing inference steps improves inter-slice continuity. The work demonstrates a feasible, data-driven bridge between X-ray and MRI modalities, with potential to enhance access to MRI-like insights in resource-limited settings, while acknowledging that the generated MRI is not a clinical replacement and relies on large paired datasets and substantial compute.
Abstract
Knee osteoarthritis (KOA) is a prevalent musculoskeletal disorder, often diagnosed using X-rays because of their cost-effectiveness. Magnetic Resonance Imaging (MRI) provides superior soft-tissue visualization and serves as a valuable supplementary diagnostic tool, but its high cost and limited accessibility restrict its widespread use. To explore whether this imaging gap can be bridged, we conducted a feasibility study using a diffusion-based model that takes an X-ray image as conditional input, together with the target depth and additional patient-specific feature information, to generate the corresponding MRI sequences. Our findings show that the MRI volumes generated by our approach are visually closer to real MRI scans than those produced by state-of-the-art diffusion baselines. Moreover, increasing the number of inference steps improves the continuity and smoothness of the synthesized MRI sequences. Ablation studies further confirm that integrating supplementary patient-specific information, beyond what X-rays alone can provide, improves the accuracy and clinical relevance of the generated MRI, underscoring the potential of external patient-specific information to improve MRI generation. This study is available at https://zwang78.github.io/.
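The conditioning described above (a target depth plus patient-specific features, alongside the X-ray) can be illustrated with a minimal sketch. It is not the authors' implementation: the sinusoidal depth embedding, the function names, the embedding size, and the placeholder feature values are all assumptions made for illustration, and the X-ray encoder and the diffusion model itself are omitted.

```python
import numpy as np

def depth_embedding(depth_index, num_slices, dim=16):
    """Sinusoidal embedding of a slice depth normalized to [0, 1] (assumed encoding)."""
    position = depth_index / max(num_slices - 1, 1)
    freqs = 2.0 ** np.arange(dim // 2)  # hypothetical frequency bands
    angles = np.pi * position * freqs
    return np.concatenate([np.sin(angles), np.cos(angles)])

def build_condition(depth_index, num_slices, radiographic_features, dim=16):
    """Concatenate the depth embedding with a patient-specific feature vector."""
    features = np.asarray(radiographic_features, dtype=np.float64)
    return np.concatenate([depth_embedding(depth_index, num_slices, dim), features])

def condition_for_volume(num_slices, radiographic_features, dim=16):
    """Build one conditioning vector per target slice of the volume."""
    return np.stack([
        build_condition(k, num_slices, radiographic_features, dim)
        for k in range(num_slices)
    ])

# Placeholder feature values; which radiographic features are used is not specified here.
cond = condition_for_volume(num_slices=8, radiographic_features=[0.5, 1.0, 2.0])
print(cond.shape)
```

In a sketch like this, the same 2D X-ray and feature vector are reused for every slice, and only the depth component changes, which is one way a 2D input could drive slice-by-slice generation of a 3D volume.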
