TCIP: Threshold-Controlled Iterative Pyramid Network for Deformable Medical Image Registration

Heming Wu, Di Wang, Tai Ma, Peng Zhao, Yubin Xiao, Zhongke Wu, Xing-Ce Wang, Chuang Li, Xuan Wu, You Zhou

TL;DR

TCIP tackles misalignment propagation and inefficient iteration in deformable medical image registration by coupling a lightweight Feature-Enhanced Residual Module (FERM) with a dual-stage Threshold-Controlled Iterative (TCI) strategy in a pyramid framework. FERM enhances decoding layers by fusing features, applying 3D channel attention, and producing deformation fields, while TCI adaptively determines when to stop iterations based on stability and convergence measured via Normalized Cross-Correlation. The approach achieves state-of-the-art Dice scores across four datasets (three brain MRI, one abdomen CT) with comparable speed and substantially fewer parameters, and ablation studies confirm the effectiveness of FERM and TCI. Overall, TCIP provides a robust, efficient solution for multi-scale deformable registration with strong generalizability and potential for further refinements in attention mechanisms and local region weighting.

Abstract

Although pyramid networks have demonstrated superior performance in deformable medical image registration, their decoder architectures are inherently prone to propagating and accumulating anatomical structure misalignments. Moreover, most existing models do not adaptively determine the number of optimization iterations under the varying deformation requirements of different images, resulting in either premature termination or excessive iterations that degrade registration accuracy. To effectively mitigate the accumulation of anatomical misalignments, we propose the Feature-Enhanced Residual Module (FERM) as the core component of each decoding layer in the pyramid network. FERM comprises three sequential blocks that extract anatomical semantic features, learn to suppress irrelevant features, and estimate the final deformation field, respectively. To adaptively determine the number of iterations for varying images, we propose the dual-stage Threshold-Controlled Iterative (TCI) strategy. In the first stage, TCI assesses registration stability; once stability is established, it proceeds to the second stage to evaluate convergence. We name the model that integrates FERM and TCI the Threshold-Controlled Iterative Pyramid (TCIP). Extensive experiments on three public brain MRI datasets and one abdomen CT dataset demonstrate that TCIP outperforms state-of-the-art (SOTA) registration networks in terms of accuracy, while maintaining comparable inference speed and a compact model parameter size. Finally, we assess the generalizability of FERM and TCI by integrating them with existing registration networks, and we conduct ablation studies to validate the effectiveness of the two proposed components.
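
The second FERM block is described only as learning to suppress irrelevant features, and the TL;DR characterizes it as 3D channel attention. The NumPy sketch below illustrates one common form of channel attention on a 3D feature map, a squeeze-and-excitation style gate. The function name `channel_attention_3d`, the two-layer gating network, and the weight shapes are assumptions made for illustration; the text here does not specify the actual layer design.

```python
import numpy as np

def channel_attention_3d(x, w1, b1, w2, b2):
    """Squeeze-and-excitation style gate for a feature map of shape (C, D, H, W).

    Hypothetical sketch: w1 has shape (C_hidden, C) and w2 has shape (C, C_hidden).
    Returns the re-weighted feature map and the per-channel gate values.
    """
    # Squeeze: global average pooling over the three spatial axes -> (C,)
    z = x.mean(axis=(1, 2, 3))
    # Excitation: small bottleneck network with a ReLU then a sigmoid
    h = np.maximum(w1 @ z + b1, 0.0)
    gate = 1.0 / (1.0 + np.exp(-(w2 @ h + b2)))
    # Scale: broadcast one gate value per channel over (D, H, W)
    return x * gate[:, None, None, None], gate
```

A gate value near 0 attenuates a channel and a value near 1 passes it through unchanged, which is one way "suppressing irrelevant features" could be realized in practice.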

Paper Structure

This paper contains 19 sections, 12 equations, 9 figures, 11 tables, 1 algorithm.

Figures (9)

  • Figure 1: Illustration of the differences in decoding processes between existing pyramid models and our proposed TCIP. Sub-figure (a) portrays the existing models b26 and b24, which neither effectively prevent propagation and accumulation of anatomical structure misalignments nor adaptively determine the number of iterations. Sub-figure (b) depicts the proposed TCIP. At each decoding layer, TCIP employs our designed FERM to learn to suppress irrelevant features, thereby reducing misalignment accumulation. Subsequently, TCIP employs our novel dual-stage TCI strategy to adaptively determine the number of iterations for guiding FERM in the progressive optimization process of the deformation field.
  • Figure 2: Overall TCIP model architecture. A weight-sharing encoder first extracts multi-scale feature maps $\{F_l\}$ and $\{M_l\}$ for the fixed image $I_f$ and the moving image $I_m$, respectively. In the subsequent decoding layers, TCIP employs FERM to emphasize informative features and estimate deformation field $\phi_l$, while TCI adaptively determines the number of iterations for each decoding layer, enabling progressive optimization of $\phi_l$. Notably, the encoder processes higher-resolution feature maps first, while the decoder begins deformation estimation on the lowest‑resolution maps and proceeds to finer scales.
  • Figure 3: Depiction of the proposed FERM, comprising the FFB, SEB, and DeF modules. FFB is responsible for extracting anatomical structure details. SEB suppresses irrelevant features to mitigate anatomical structure misalignment accumulation. DeF estimates the deformation fields.
  • Figure 4: Depiction of the proposed TCI strategy, which determines when to stop the iteration through a dual-stage mechanism. TCI first checks whether the standard deviation $\varepsilon_l$ of the similarity differences across all historical registration results in $W_l$ is lower than a threshold $\delta_s$. If so, TCI then computes the change $\Delta s$ between the similarity differences of the two most recent registration results in $W_l$. If $\Delta s$ is smaller than a second threshold $\delta_c$, the stopping criterion is considered to have been reached.
  • Figure 5: Visualization of registration results and corresponding deformation fields across different models.
  • ...and 4 more figures
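
The dual-stage stopping rule in the Figure 4 caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: each entry of the history $W_l$ is taken to be a single scalar score (here a global Normalized Cross-Correlation), $\Delta s$ is taken as an absolute difference, and the names `ncc`, `tci_should_stop`, `run_layer`, and `refine`, along with the default threshold values, are hypothetical.

```python
import numpy as np

def ncc(a, b, eps=1e-8):
    """Global normalized cross-correlation between two arrays of equal shape."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / (np.sqrt((a * a).sum() * (b * b).sum()) + eps))

def tci_should_stop(history, delta_s, delta_c):
    """Dual-stage check on the score history W_l (assumed: a list of scalars)."""
    if len(history) < 2:
        return False  # need at least two results to compare
    # Stage 1 (stability): std of all recorded scores must fall below delta_s
    if float(np.std(history)) >= delta_s:
        return False
    # Stage 2 (convergence): change between the two most recent scores
    delta = abs(history[-1] - history[-2])
    return delta < delta_c

def run_layer(refine, fixed, moving, delta_s=0.05, delta_c=0.01, max_iters=10):
    """Repeat a refinement step until the TCI check fires or max_iters is hit.

    `refine(fixed, warped)` is a hypothetical stand-in for one FERM pass plus warping.
    """
    history, warped = [], moving
    for _ in range(max_iters):
        warped = refine(fixed, warped)
        history.append(ncc(fixed, warped))
        if tci_should_stop(history, delta_s, delta_c):
            break
    return warped, history
```

Because stage 1 is computed over the whole history, a large early jump in the score keeps the standard deviation high and delays stopping; whether $W_l$ is unbounded or a sliding window is not stated in the caption, so the sketch keeps every entry.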