ModeTv2: GPU-accelerated Motion Decomposition Transformer for Pairwise Optimization in Medical Image Registration
Haiqiao Wang, Zhuoyuan Wang, Dong Ni, Yi Wang
TL;DR
ModeTv2 addresses deformable image registration (DIR) with a GPU-accelerated, interpretable operator that makes pairwise optimization both accurate and efficient. It uses a pyramid architecture that combines a CUDA-accelerated Motion Decomposition Transformer (ModeT) with a lightweight RegHead, which fuses multiple motion subfields into the total deformation field $φ$, optionally via a diffeomorphic layer. The approach achieves state-of-the-art registration performance and fast convergence on four public datasets, and performs strongly under pairwise optimization in both same-domain and cross-domain settings, which the authors attribute to CUDA-accelerated computation and inductive biases suited to registration. The work improves the usability and generalization of deep learning (DL) based DIR, offering a practical, scalable solution for clinical image registration with potential extensions to multi-modal scenarios.
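To make the idea of fusing motion subfields concrete, the following is a minimal sketch in plain Python. It assumes, purely for illustration, that each voxel has K candidate displacement vectors and K scores, and that the scores are turned into weights with a softmax. The function names `softmax` and `fuse_subfields` and the softmax weighting are hypothetical and are not taken from the ModeTv2 code; the actual RegHead is a learned module.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_subfields(subfields, logits):
    """Fuse K motion subfields into one displacement field.

    subfields: list of K subfields, each a list of per-voxel displacement tuples.
    logits:    per-voxel list of K scores (assumed input, for illustration only).
    Returns a list of per-voxel displacement tuples (the weighted sum).
    """
    num_voxels = len(subfields[0])
    fused = []
    for v in range(num_voxels):
        weights = softmax(logits[v])
        dims = len(subfields[0][v])
        fused.append(tuple(
            sum(w * subfields[k][v][d] for k, w in enumerate(weights))
            for d in range(dims)
        ))
    return fused

# Two subfields over a single 2D voxel, weighted equally.
example = fuse_subfields([[(1.0, 0.0)], [(0.0, 1.0)]], [[0.0, 0.0]])
```

With equal scores the fused displacement is the plain average of the candidates; as one score dominates, the fused displacement approaches the corresponding subfield.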
Abstract
Deformable image registration plays a crucial role in medical imaging, aiding in disease diagnosis and image-guided interventions. Traditional iterative methods are slow, while deep learning (DL) accelerates solutions but faces usability and precision challenges. This study introduces a pyramid network with the enhanced motion decomposition Transformer (ModeTv2) operator, showcasing superior pairwise optimization (PO) ability akin to that of traditional methods. We re-implement the ModeT operator with CUDA extensions to enhance its computational efficiency. We further propose a RegHead module that refines deformation fields, improves the realism of the deformation, and reduces the number of parameters. By adopting PO, the proposed network balances accuracy, efficiency, and generalizability. Extensive experiments on three public brain MRI datasets and one abdominal CT dataset demonstrate the network's suitability for PO, providing a DL model with enhanced usability and interpretability. The code is publicly available at https://github.com/ZAX130/ModeTv2.
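As a rough illustration of what pairwise optimization means, the sketch below refines a registration parameter on one specific image pair by gradient descent on a dissimilarity measure. It is a toy 1D example and not the authors' method: a single uniform shift stands in for the network parameters, the dissimilarity is mean squared error, and the gradient is estimated by finite differences. All names (`sample`, `mse`, `pairwise_optimize`) and hyperparameters are assumptions chosen for this example.

```python
import math

def sample(signal, x):
    """Linearly interpolate a 1D signal at position x (clamped at the borders)."""
    x = min(max(x, 0.0), len(signal) - 1.0)
    i = min(int(x), len(signal) - 2)
    frac = x - i
    return (1.0 - frac) * signal[i] + frac * signal[i + 1]

def mse(fixed, moving, shift):
    """Mean squared error between fixed and moving warped by a uniform shift."""
    return sum((f - sample(moving, i + shift)) ** 2
               for i, f in enumerate(fixed)) / len(fixed)

def pairwise_optimize(fixed, moving, shift=0.0, steps=300, lr=5.0, eps=1e-3):
    """Refine the shift for this one pair with finite-difference gradient descent."""
    for _ in range(steps):
        grad = (mse(fixed, moving, shift + eps)
                - mse(fixed, moving, shift - eps)) / (2 * eps)
        shift -= lr * grad
    return shift

# Toy pair: the same Gaussian bump, offset by three samples.
fixed = [math.exp(-((i - 20) / 5.0) ** 2 / 2) for i in range(48)]
moving = [math.exp(-((i - 23) / 5.0) ** 2 / 2) for i in range(48)]
estimated_shift = pairwise_optimize(fixed, moving)
```

In a DL setting the same loop would instead update network weights (or the predicted field) for the given test pair, starting from the trained model rather than from zero.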
