SACB-Net: Spatial-awareness Convolutions for Medical Image Registration

Xinxing Cheng, Tianyang Zhang, Wenqi Lu, Qingjie Meng, Alejandro F. Frangi, Jinming Duan

TL;DR

SACB-Net addresses the challenge of capturing spatially varying information in 3D medical image registration by introducing a spatial-awareness convolution block (SACB) that generates region-specific adaptive kernels via feature-space clustering. The method integrates SACB into a pyramid flow estimator to enable multi-scale deformation estimation, improving large-deformation handling while maintaining plausible field regularity. Across brain atlas, inter-subject brain, and abdomen CT datasets, SACB-Net achieves state-of-the-art or competitive Dice and deformation metrics with a compact parameter footprint, and demonstrates that the SACB-based flow estimator can serve as a plug-in for other architectures. The work highlights the practical benefit of region-aware convolution in registration and suggests broader applicability of spatial-adaptive kernels for learning-based deformation estimation.

Abstract

Deep learning-based image registration methods have shown state-of-the-art performance and rapid inference speeds. Despite these advances, many existing approaches fall short in capturing spatially varying information in non-local regions of feature maps due to the reliance on spatially-shared convolution kernels. This limitation leads to suboptimal estimation of deformation fields. In this paper, we propose a 3D Spatial-Awareness Convolution Block (SACB) to enhance the spatial information within feature representations. Our SACB estimates the spatial clusters within feature maps by leveraging feature similarity and subsequently parameterizes the adaptive convolution kernels across diverse regions. This adaptive mechanism generates the convolution kernels (weights and biases) tailored to spatial variations, thereby enabling the network to effectively capture spatially varying information. Building on SACB, we introduce a pyramid flow estimator (named SACB-Net) that integrates SACBs to facilitate multi-scale flow composition, particularly addressing large deformations. Experimental results on the brain IXI and LPBA datasets as well as Abdomen CT datasets demonstrate the effectiveness of SACB and the superiority of SACB-Net over the state-of-the-art learning-based registration methods. The code is available at https://github.com/x-xc/SACB_Net .
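
The mechanism described in the abstract (cluster the feature map by similarity, then derive a kernel and bias per cluster) can be illustrated with a small NumPy sketch. This is not the authors' implementation. The function names, the plain Lloyd's k-means, the single linear layers standing in for the MLPs, the sigmoid modulation of a shared kernel, and the random untrained parameters are all assumptions made for illustration.

```python
import numpy as np

def kmeans(x, k, iters=10, seed=0):
    """Plain Lloyd's k-means on the rows of x; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared Euclidean distance from every row to every centroid.
        dist = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        labels = dist.argmin(1)
        for c in range(k):
            if np.any(labels == c):  # keep the old centroid if a cluster empties
                centroids[c] = x[labels == c].mean(0)
    # Final assignment against the updated centroids.
    dist = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    return centroids, dist.argmin(1)

def spatially_aware_conv(feat, k=3, kernel=3, seed=0):
    """Hypothetical sketch. feat: float array (C, D, H, W).

    Returns (refined features of the same shape, cluster index map (D, H, W)).
    """
    rng = np.random.default_rng(seed)
    C, D, H, W = feat.shape
    n_vox, taps, pad = D * H * W, kernel ** 3, kernel // 2

    # Unfold: gather the kernel^3 neighbourhood around every voxel.
    padded = np.pad(feat, ((0, 0), (pad, pad), (pad, pad), (pad, pad)))
    windows = np.lib.stride_tricks.sliding_window_view(
        padded, (kernel, kernel, kernel), axis=(1, 2, 3))  # (C, D, H, W, k, k, k)
    patches = windows.reshape(C, n_vox, taps).transpose(1, 0, 2)  # (N, C, taps)
    flat = patches.reshape(n_vox, -1)                             # (N, C * taps)

    # Spatial context: cluster voxels by similarity of their unfolded features.
    centroids, labels = kmeans(flat, k, seed=seed)

    # Assumed parameterisation: a shared kernel modulated per cluster.
    shared = rng.standard_normal((C, C, taps)) * 0.1       # (C_out, C_in, taps)
    to_weight = rng.standard_normal((flat.shape[1], taps)) * 0.1
    to_bias = rng.standard_normal((flat.shape[1], C)) * 0.1

    out = np.empty((n_vox, C))
    for c in range(k):
        mask = labels == c
        scale = 1.0 / (1.0 + np.exp(-(centroids[c] @ to_weight)))  # (taps,)
        bias = centroids[c] @ to_bias                              # (C_out,)
        cluster_kernel = shared * scale
        # Convolution on unfolded patches is a contraction over channels and taps.
        out[mask] = np.einsum("nik,oik->no", patches[mask], cluster_kernel) + bias

    refined = feat + out.T.reshape(C, D, H, W)  # residual connection
    return refined, labels.reshape(D, H, W)
```

Calling `spatially_aware_conv(np.random.default_rng(1).standard_normal((2, 4, 4, 4)))` returns a refined array of the same shape together with an index map whose entries lie in `{0, 1, 2}`. Voxels that share an index are filtered by the same kernel, while voxels in different clusters are not.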

Paper Structure

This paper contains 22 sections, 12 equations, 10 figures, 5 tables.

Figures (10)

  • Figure 1: Spatial-awareness Convolution. Since deformation is usually related to tissue shape in medical image registration, voxels/features from different regions should be given varying levels of attention. However, vanilla convolution methods apply shared kernel weights across all regions, leading to a suboptimal estimation of deformation fields. SAC mechanisms, on the other hand, apply distinct kernel attention weights for different spatial clusters.
  • Figure 2: Illustration of a 5-level pyramid SACB-Net. SACB-Net includes a shared encoder that extracts multi-scale feature maps $\{F_m^i\}$ and $\{F_f^i\}$ for the moving image $I_m$ and the fixed image $I_f$, as well as a pyramid of flow estimators, one at each scale. At the lowest level, the flow estimator learns the deformation ($\varphi_5$) from the extracted moving and fixed image features ($F_m^5$ and $F_f^5$). Each subsequent flow estimator takes the level-wise features and the deformation output by its preceding level to compose the deformation. Each flow estimator includes a Spatial-Awareness Convolution Block (SACB) to enhance spatially adaptive feature representation, along with a similarity matching module for flow estimation.
  • Figure 3: Architecture of the 3D spatial-awareness convolution block (example with three clusters). This block refines the input feature $\mathbf{F}$ to $\hat{\mathbf{F}}$ using an adaptive convolution learned from spatial feature clustering. SACB consists of three parts: 1) the spatial context estimation module applies KMeans to the unfolded feature patches to cluster similar spatial features; features assigned to the same cluster centroid $S_n^c$ are marked in a cluster index map $S_n$. 2) the adaptive kernel generator uses each cluster centroid to generate cluster-specific spatial weights and a bias via two MLPs; the resulting weights and bias form the final spatially adaptive convolution kernel, which is applied to the cluster-indexed unfolded features, again emphasizing spatial awareness. 3) the original features and the spatial-aware features are combined through a residual connection to form the final refined $\hat{\mathbf{F}}$.
  • Figure 4: Visual comparisons on the Brain LPBA dataset (top two rows) and the Abdomen CT dataset (bottom two rows). Columns 3–10 display warped moving images (rows 1 and 3), displacement fields as RGB images (row 2) and warped segmentation masks (row 4).
  • Figure 5: Diagram of the shared encoder architecture, featuring five convolutional blocks to extract multi-scale feature maps and four average pooling layers for downsampling.
  • ...and 5 more figures
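
The coarse-to-fine flow composition mentioned in the Figure 2 caption can be sketched in one dimension as follows. This is a minimal illustration, not the paper's code. It assumes the pyramid halves resolution at each level and uses one common composition convention, $u(x) = u_{new}(x) + u_{prev}(x + u_{new}(x))$; the helper names `warp`, `upsample_flow`, and `compose` are hypothetical, and the actual model operates on 3D fields.

```python
import numpy as np

def warp(signal, disp):
    """Sample a 1D signal at x + disp(x) with linear interpolation.

    Positions outside the grid take the nearest edge value.
    """
    x = np.arange(len(signal), dtype=float)
    return np.interp(x + disp, x, signal)

def upsample_flow(disp, factor=2):
    """Upsample a displacement field to a finer grid.

    Displacements are measured in voxels, so their magnitude is multiplied
    by the same factor as the grid resolution.
    """
    n = len(disp)
    coarse_x = (np.arange(n) + 0.5) * factor - 0.5  # coarse centres on the fine grid
    fine_x = np.arange(n * factor, dtype=float)
    return factor * np.interp(fine_x, coarse_x, disp)

def compose(disp_prev, disp_new):
    """Displacement of the composed map: u(x) = u_new(x) + u_prev(x + u_new(x))."""
    return disp_new + warp(disp_prev, disp_new)

# Displacement from the coarser level (4 voxels) and a residual
# displacement estimated at the finer level (8 voxels).
coarse = np.full(4, 0.5)
residual = np.full(8, 0.25)
total = compose(upsample_flow(coarse), residual)
```

With these constant fields, the coarse displacement of 0.5 becomes 1.0 on the finer grid and the composed displacement is 1.25 everywhere. Repeating the step from the coarsest to the finest level lets each estimator predict only a small residual, which is how a pyramid can accumulate a large overall deformation.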