On the Degrees of Freedom of Gridded Control Points in Learning-Based Medical Image Registration

Wen Yan, Qianye Yang, Yipei Wang, Shonit Punwani, Mark Emberton, Vasilis Stavrinides, Yipeng Hu, Dean Barratt

Abstract

Many registration problems are ill-posed in homogeneous or noisy regions, and dense voxel-wise decoders can be unnecessarily high-dimensional. A sparse control-point parameterisation provides a compact, smooth deformation representation while reducing memory and improving stability. This work investigates how many control points are required when developing learning-based registration networks. We present GridReg, a learning-based registration framework that replaces dense voxel-wise decoding with displacement predictions at a sparse grid of control points. This design substantially reduces the parameter count and memory use while retaining registration accuracy. Multiscale 3D encoder feature maps are flattened into a 1D token sequence with positional encoding to retain spatial context. The model then predicts a sparse gridded deformation field using a cross-attention module. We further introduce grid-adaptive training, enabling a single model to operate at multiple grid sizes at inference without retraining. This work quantitatively demonstrates the benefits of using sparse grids. On three data sets, covering registration of the prostate gland, pelvic organs and neurological structures, the results suggest a significant improvement from using a grid-controlled displacement field. The proposed approach achieved superior registration performance, at a similar or lower computational cost, compared with existing algorithms that predict dense displacement fields (DDFs) or displacements sampled on scattered key points.
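To make the degrees-of-freedom argument concrete, the following is a minimal NumPy sketch of how displacements predicted at a sparse grid of control points could be expanded into a dense displacement field by trilinear interpolation. It is not the authors' implementation: the function names, the corner-to-corner control-point layout and the example sizes are assumptions chosen for illustration only.

```python
import numpy as np


def upsample_grid_to_ddf(grid_disp, image_shape):
    """Trilinearly interpolate control-point displacements to a dense field.

    grid_disp:   array of shape (gx, gy, gz, 3); control points are ASSUMED
                 to span the image from corner to corner.
    image_shape: (D, H, W) of the dense output.
    Returns an array of shape (D, H, W, 3).
    """
    out = np.asarray(grid_disp, dtype=float)
    # Trilinear interpolation is separable: interpolate linearly along
    # each spatial axis in turn.
    for axis, n_out in enumerate(image_shape):
        n_in = out.shape[axis]
        # Position of every output voxel in control-point index units.
        pos = np.linspace(0.0, n_in - 1, n_out)
        lo = np.clip(np.floor(pos).astype(int), 0, max(n_in - 2, 0))
        hi = np.minimum(lo + 1, n_in - 1)
        w_shape = [1] * out.ndim
        w_shape[axis] = n_out
        w = (pos - lo).reshape(w_shape)
        out = np.take(out, lo, axis=axis) * (1.0 - w) + np.take(out, hi, axis=axis) * w
    return out


def degrees_of_freedom(shape):
    """Number of free displacement parameters (3 per point) for a grid shape."""
    return 3 * int(np.prod(shape))


# Illustrative sizes only (not taken from the paper).
grid_shape, image_shape = (3, 3, 3), (5, 5, 5)
grid_disp = np.zeros(grid_shape + (3,))
grid_disp[..., 0] = np.arange(3).reshape(3, 1, 1)  # linear ramp along axis 0
ddf = upsample_grid_to_ddf(grid_disp, image_shape)
sparse_dof = degrees_of_freedom(grid_shape)
dense_dof = degrees_of_freedom(image_shape)
```

In this toy setting the network would only need to predict `sparse_dof` values rather than `dense_dof`; the interpolation step itself has no learnable parameters. A B-spline or transposed-convolution upsampler could be substituted for the trilinear step.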

Paper Structure (24 sections, 14 equations, 8 figures, 4 tables)

Figures (8)

  • Figure 1: An illustrative example of false-positive correspondences (orange stars), possibly caused by similar intensity patterns (e.g., under an unsupervised similarity loss) or label errors (e.g., segmentation noise). A sparse, low-resolution grid helps filter out such noise, as stronger "true" correspondences (the four corners of the shaded ROIs) dominate. In contrast, a finer, high-resolution grid may amplify the noise, distorting additional control points within the ROIs and leading to inaccurate registration.
  • Figure 2: Illustration of GridReg. The encoder extracts multi-scale 3D feature maps; at each scale, a skip-projection flattens features into 1D tokens that are fused with grid-cell queries via (local) attention. The resulting features are mapped to a sparse control-point displacement field via Bayesian integration, and a dense displacement field (DDF) is obtained by interpolation (e.g., trilinear, B-spline, or transposed convolution).
  • Figure 3: Examples of anatomical landmarks annotated for the same patient at two different time points. The first and second rows show 5 landmarks at the first and second time points, respectively.
  • Figure 4: Examples of 7 landmarks labelled by experts in the brain dataset.
  • Figure 5: Visualisation of prostate registration. The first and second columns show four examples of the fixed and moving images, respectively. The next five columns show the registration results from (A) GridReg, (B) VoxelMorph, (C) KeyMorph, (D) TransMorph and (E) ANTs SyN, respectively.
  • ...and 3 more figures
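
The tokenisation and attention steps described in the Figure 2 caption can be sketched roughly as follows. This is a simplified NumPy illustration under several assumptions that do not come from the paper: a sinusoidal positional encoding, a single scale, single-head global attention (rather than local attention), queries built from grid-cell positions alone, and a hypothetical random linear head standing in for the Bayesian integration step. All function names are made up for this example.

```python
import numpy as np


def normalised_coords(shape):
    """Cell centres of a regular 3D grid, normalised to [0, 1]; shape (N, 3)."""
    axes = [(np.arange(n) + 0.5) / n for n in shape]
    mesh = np.meshgrid(*axes, indexing="ij")
    return np.stack([m.ravel() for m in mesh], axis=-1)


def sinusoidal_encoding(coords, dim):
    """Assumed positional encoding; dim must be divisible by 6."""
    freqs = 2.0 ** np.arange(dim // 6)
    angles = coords[:, :, None] * freqs * np.pi          # (N, 3, dim // 6)
    enc = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    return enc.reshape(coords.shape[0], -1)               # (N, dim)


def tokens_from_feature_map(feat):
    """Flatten a (C, D, H, W) feature map into (D*H*W, C) position-aware tokens."""
    channels = feat.shape[0]
    tokens = feat.reshape(channels, -1).T                 # row-major voxel order
    return tokens + sinusoidal_encoding(normalised_coords(feat.shape[1:]), channels)


def cross_attention(queries, tokens):
    """Single-head scaled dot-product attention: grid queries attend to tokens."""
    scores = queries @ tokens.T / np.sqrt(queries.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ tokens, weights


rng = np.random.default_rng(0)
feat = rng.standard_normal((12, 4, 4, 2))                 # toy encoder output
tokens = tokens_from_feature_map(feat)                    # (32, 12)

grid_shape = (2, 2, 2)                                    # sparse control grid
queries = sinusoidal_encoding(normalised_coords(grid_shape), 12)
fused, weights = cross_attention(queries, tokens)

# Hypothetical linear head (random here) mapping fused features to 3D displacements.
head = 0.01 * rng.standard_normal((12, 3))
grid_disp = (fused @ head).reshape(*grid_shape, 3)
```

The point of the sketch is the data flow: the number of output displacements is set by the number of grid queries, not by the resolution of the feature map, which is what allows the grid size to vary independently of the encoder.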