MR-Transformer: Vision Transformer for Total Knee Replacement Prediction Using Magnetic Resonance Imaging
Chaojie Zhang, Shengjia Chen, Ozkan Cigdem, Haresh Rengaraj Rajamohan, Kyunghyun Cho, Richard Kijowski, Cem M. Deniz
TL;DR
The paper introduces MR-Transformer, a Vision Transformer-based model adapted to 3D knee MRI to predict total knee replacement (TKR). The model leverages ImageNet pre-training and captures long-range and 3D spatial information from the scans. Trained on matched case–control cohorts from the OAI and MOST databases across multiple MRI contrasts, it demonstrates state-of-the-art AUC performance relative to MRNet, TSE, and 3DMeT on several tissue contrasts, with statistically significant gains in key sequences. The approach provides interpretable attention maps that highlight the joint regions most informative for TKR risk, while the authors acknowledge the substantial computational demands of self-attention on large 3D inputs. These findings support the potential of pre-trained Vision Transformers for small medical datasets and MRI-based prognostic tasks, with opportunities for efficiency-focused future work and broader applications.
Abstract
A transformer-based deep learning model, MR-Transformer, was developed for total knee replacement (TKR) prediction using magnetic resonance imaging (MRI). The model incorporates ImageNet pre-training and captures three-dimensional (3D) spatial correlation from the MR images. Its performance was compared to that of existing state-of-the-art deep learning models for knee injury diagnosis using MRI. Knee MR scans of four different tissue contrasts from the Osteoarthritis Initiative and Multicenter Osteoarthritis Study databases were used in the study. Experimental results demonstrated state-of-the-art performance of the proposed model on TKR prediction using MRI.
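As a rough illustration of why self-attention over a full 3D MR volume is computationally demanding, the sketch below counts patch tokens and pairwise attention scores for a stack of 2D slices. The patch size (16), slice size (224 x 224), slice count (36), and the function name `attention_cost` are illustrative assumptions, not values or code reported by the authors.

```python
def attention_cost(num_slices, height, width, patch=16):
    """Count tokens and pairwise attention scores for a stack of 2D slices.

    Assumes each slice is split into non-overlapping patch x patch tokens,
    as in a standard Vision Transformer. All sizes are hypothetical.
    """
    tokens_per_slice = (height // patch) * (width // patch)
    total_tokens = num_slices * tokens_per_slice
    return {
        "tokens_per_slice": tokens_per_slice,
        "total_tokens": total_tokens,
        # Attention restricted to tokens within the same slice.
        "per_slice_pairs": num_slices * tokens_per_slice ** 2,
        # Attention across every token in the whole volume.
        "joint_pairs": total_tokens ** 2,
    }


# Hypothetical volume: 36 slices of 224 x 224 pixels.
cost = attention_cost(num_slices=36, height=224, width=224)
print(cost)
print("joint / per-slice ratio:", cost["joint_pairs"] // cost["per_slice_pairs"])
```

Under these assumptions, attending jointly across the whole volume computes as many times more pairwise scores as there are slices, compared with attending within each slice separately. This is one way to see the quadratic growth in cost as the 3D input gets larger.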
