Attention-Driven Framework for Non-Rigid Medical Image Registration
Muhammad Zafar Iqbal, Ghazanfar Farooq Siddiqui, Anwar Ul Haq, Imran Razzak
TL;DR
Deformable medical image registration must handle large deformations while keeping the estimated transformations anatomically plausible. The paper introduces AD-RegNet, an attention-driven framework that combines a 3D U-Net backbone, a bidirectional cross-attention module, regional adaptive attention, and multi-resolution deformation field synthesis to estimate a dense displacement field $\Phi \in \mathbb{R}^{3 \times D \times H \times W}$ with $I_R = \mathcal{T}(I_M, \Phi)$. Training optimizes a multi-term loss $\mathcal{L} = \lambda_{sim} \mathcal{L}_{sim} + \lambda_{reg} \mathcal{L}_{reg} + \lambda_{landmark} \mathcal{L}_{landmark}$, and evaluations on DIRLab and IXI show competitive TRE (1.51 mm) and DSC (0.759) with plausible deformations (mean Jacobian determinant near 1 and few negative Jacobian determinants). The approach is robust across modalities (lung CT and brain MRI), and the authors present attention-guided registration as suitable for clinical contexts such as disease diagnosis and image-guided interventions.
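The weighted sum in the loss above can be sketched in a few lines of NumPy. This is only an illustration of how the three terms combine: the summary does not define the individual terms, so the mean squared error similarity, the finite-difference smoothness penalty, the mean landmark distance, and the default weights below are all assumptions rather than the authors' choices.

```python
import numpy as np

def similarity_loss(fixed, warped):
    # Assumed similarity term: mean squared intensity difference.
    return float(np.mean((fixed - warped) ** 2))

def smoothness_loss(phi):
    # Assumed regularizer: mean squared finite-difference gradient of the
    # displacement field phi with shape (3, D, H, W), averaged over the
    # three spatial axes.
    terms = [np.mean(np.diff(phi, axis=axis) ** 2) for axis in (1, 2, 3)]
    return float(np.mean(terms))

def landmark_loss(warped_pts, fixed_pts):
    # Assumed landmark term: mean Euclidean distance between paired
    # landmarks, each array of shape (N, 3).
    return float(np.mean(np.linalg.norm(warped_pts - fixed_pts, axis=1)))

def total_loss(fixed, warped, phi, warped_pts, fixed_pts,
               lam_sim=1.0, lam_reg=0.1, lam_landmark=0.5):
    # L = lam_sim * L_sim + lam_reg * L_reg + lam_landmark * L_landmark.
    # The default weights are placeholders, not values from the paper.
    return (lam_sim * similarity_loss(fixed, warped)
            + lam_reg * smoothness_loss(phi)
            + lam_landmark * landmark_loss(warped_pts, fixed_pts))
```

In a training setting the same structure would be written in an autodiff framework so that gradients flow back to the network producing $\Phi$; the NumPy version only shows how the terms are weighted and summed.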
Abstract
Deformable medical image registration is a fundamental task in medical image analysis with applications in disease diagnosis, treatment planning, and image-guided interventions. Despite significant advances in deep learning-based registration methods, accurately aligning images with large deformations while preserving anatomical plausibility remains challenging. In this paper, we propose a novel Attention-Driven Framework for Non-Rigid Medical Image Registration (AD-RegNet) that employs attention mechanisms to guide the registration process. Our approach combines a 3D U-Net backbone with bidirectional cross-attention, which establishes correspondences between moving and fixed images at multiple scales. We introduce a regional adaptive attention mechanism that focuses on anatomically relevant structures, along with a multi-resolution deformation field synthesis approach for accurate alignment. The method is evaluated on two distinct datasets, DIRLab for thoracic 4D CT scans and IXI for brain MRI scans, demonstrating its versatility across different anatomical structures and imaging modalities. Experimental results show that our approach achieves performance competitive with state-of-the-art methods on both datasets. The proposed method maintains a favorable balance between registration accuracy and computational efficiency, making it suitable for clinical applications. A comprehensive evaluation using normalized cross-correlation (NCC), mean squared error (MSE), structural similarity (SSIM), the Jacobian determinant, and target registration error (TRE) indicates that attention-guided registration improves alignment accuracy while ensuring anatomically plausible deformations.
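Three of the evaluation metrics named in the abstract have standard definitions that can be sketched in NumPy. The code below is a minimal illustration, not the authors' evaluation code. It assumes a global (whole-volume) NCC, although registration papers often use a windowed variant; a displacement field `phi` of shape `(3, D, H, W)` expressed in voxel units; and landmark coordinates in voxel indices with a known voxel spacing in mm. All function names are hypothetical.

```python
import numpy as np

def ncc(a, b, eps=1e-8):
    # Global normalized cross-correlation between two volumes, in [-1, 1].
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).mean() / (a.std() * b.std() + eps))

def jacobian_determinant(phi):
    # Determinant of the Jacobian of the mapping x -> x + phi(x) at every
    # voxel. phi has shape (3, D, H, W); the result has shape (D, H, W).
    # Values near 1 mean local volume is preserved; values <= 0 indicate
    # folding, which is anatomically implausible.
    grads = np.stack(
        [np.stack(np.gradient(phi[i], axis=(0, 1, 2)), axis=-1)
         for i in range(3)],
        axis=-2,
    )  # grads[..., i, j] = d phi_i / d x_j
    return np.linalg.det(grads + np.eye(3))

def folding_fraction(phi):
    # Fraction of voxels with a non-positive Jacobian determinant.
    return float(np.mean(jacobian_determinant(phi) <= 0))

def tre(warped_pts, fixed_pts, spacing=(1.0, 1.0, 1.0)):
    # Target registration error: mean Euclidean distance in mm between
    # paired landmarks of shape (N, 3) given in voxel coordinates.
    diff = (warped_pts - fixed_pts) * np.asarray(spacing)
    return float(np.mean(np.linalg.norm(diff, axis=1)))
```

For an identity transform (`phi` all zeros) the Jacobian determinant is 1 everywhere and the folding fraction is 0, which is the reference point for the plausibility figures reported in the TL;DR.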
