Multi-Objective Learning for Deformable Image Registration

Monika Grewal, Henrike Westerveld, Peter A. N. Bosman, Tanja Alderliesten

TL;DR

Deformable image registration (DIR) is inherently multi-objective, balancing image similarity, deformation smoothness, and guidance from segmentation or landmarks. The authors present a deep learning MO DIR framework built on VoxelMorph, using a shared encoder to generate $p$ DVFs and training with loss vectors $[L_{ImageSimilarity}, L_{DVFSmoothness}, L_{SegSimilarity}]$ via hypervolume (HV) maximization to approximate a diverse Pareto front. Experiments on cervical MRI with 23 landmark annotations show that MO DIR provides a set of diverse, clinically interpretable registrations that often outperform grid-search in front diversity while achieving comparable TRE and folding metrics. This approach enables a posteriori, patient-specific selection of DIR outputs and offers a more efficient alternative to weight-tuning, with future work aimed at steering the front and enhancing clinical applicability through richer models and visualization.

Abstract

Deformable image registration (DIR) involves optimization of multiple conflicting objectives; however, few existing DIR algorithms are multi-objective (MO). Further, while there has been progress in the design of deep learning algorithms for DIR, there is no work on MO DIR using deep learning. In this paper, we fill this gap by combining a recently proposed approach for MO training of neural networks with a well-known deep neural network for DIR to create a deep-learning-based MO DIR approach. We evaluate the proposed approach for DIR of pelvic magnetic resonance imaging (MRI) scans. We experimentally demonstrate that the proposed MO DIR approach, which provides multiple registration outputs for each patient that each correspond to a different trade-off between the objectives, has additional desirable properties from a clinical-use point of view compared to providing a single DIR output. The experiments also show that the proposed MO DIR approach provides a better spread of DIR outputs across the entire trade-off front than simply training multiple neural networks with weights for each objective sampled from a grid of possible values.
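
To make the notion of "spread across the trade-off front" concrete, the following is a minimal sketch of the hypervolume (HV) indicator that the TL;DR refers to. It is an illustration under simplifying assumptions, not the authors' training procedure: it uses two objectives instead of the paper's three losses, treats both as minimized, and only evaluates HV for a fixed set of loss vectors rather than maximizing it by gradient descent. The function names `non_dominated` and `hypervolume_2d`, the example loss values, and the reference point are all hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def non_dominated(points: List[Point]) -> List[Point]:
    """Return the points not dominated by any other point (minimization),
    sorted by the first objective."""
    front = []
    for p in points:
        dominated = any(
            q[0] <= p[0] and q[1] <= p[1] and q != p for q in points
        )
        if not dominated:
            front.append(p)
    return sorted(set(front))


def hypervolume_2d(points: List[Point], ref: Point) -> float:
    """Area dominated by `points` and bounded by the reference point `ref`.
    Both objectives are minimized; points not strictly better than `ref`
    in both objectives contribute nothing."""
    inside = [p for p in points if p[0] < ref[0] and p[1] < ref[1]]
    area = 0.0
    prev_y = ref[1]
    # Sorted by x ascending, a non-dominated set has y strictly descending,
    # so each point adds one rectangular strip to the dominated area.
    for x, y in non_dominated(inside):
        area += (ref[0] - x) * (prev_y - y)
        prev_y = y
    return area


# Hypothetical loss vectors (objective 1, objective 2); lower is better.
spread_set = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]
clustered_set = [(1.0, 3.0), (1.1, 2.9), (1.2, 2.8)]
reference = (4.0, 4.0)  # assumed reference point, worse than every solution

hv_spread = hypervolume_2d(spread_set, reference)
hv_clustered = hypervolume_2d(clustered_set, reference)
print(hv_spread, hv_clustered)
```

In this toy setting the evenly spread set covers more area than the clustered one, which is why maximizing HV tends to reward diverse trade-offs. The value also depends on where the reference point is placed, so HV figures are only comparable when the same reference point is used.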

Paper Structure (11 sections, 1 equation, 6 figures, 2 tables)

Figures (6)

  • Figure 1: Illustration of the proposed deep learning based MO DIR approach. $I_{source}$: source image, $I_{target}$: target image, $Seg_{source}$ and $Seg_{target}$: organ segmentation masks for source and target image, respectively. The weights of the encoder are shared among $p$ DIR networks, which output $p$ DVFs ($\Delta_1$, $\Delta_2$, ..., $\Delta_p$) to warp $I_{source}$ and $Seg_{source}$. The network is trained to simultaneously minimize $p$ loss vectors $[L_{ImageSimilarity}, L_{DVFSmoothness}, L_{SegSimilarity}]$ using MO learning.
  • Figure 2: (a) Approximation set consisting of 27 solutions, each corresponding to a different trade-off between the 3 loss functions. (b) and (g): A transverse slice from the target and source image, respectively. (c) - (f): Warped images (top row) and DVFs overlaid on the source image (bottom row) corresponding to four solutions in the set (each highlighted in the color of its image frame). The direction and scale of the arrows represent the displacement vector in the x-y plane, and the color of the arrows (contrasting for cranial vs. caudal motion) represents the displacement along the z-direction. Bladder and rectum contours are shown on the images in cyan and magenta, respectively.
  • Figure 3: Approximation sets obtained for two representative test scan pairs in (a) and (b). The colors of the points represent the TRE values in mm (left) and percent folding (right). Lower values, shown in blue tones, are better. The TRE before DIR is represented by a black line on the TRE colorbar. Black boxes indicate the likely desired regions.
  • Figure 4: Effect of parameter sharing in the encoder. Filled circles: MO DIR without parameter sharing in the encoder; triangles: MO DIR with parameter sharing in the encoder. $p=5$, $n=2$. Approximation sets obtained from the 5 models of 5-fold cross-validation are shown.
  • Figure 5: Effect of the location of the reference point on the GenMED benchmark problem [bosman2011gradients]. The Pareto front was approximated using 25 points. The solutions from 10 runs are shown for two different locations of the reference point.
  • ...and 1 more figures