UniReg: Foundation Model for Controllable Medical Image Registration

Zi Li, Jianpeng Zhang, Tai Ma, Tony C. W. Mok, Yan-Jie Zhou, Zeli Chen, Xianghua Ye, Le Lu, Dakai Jin

TL;DR

UniReg introduces a foundation-style, controllable universal registration model that unifies multiple CT registration tasks under a single network by conditioning a dynamic deformation head on anatomical priors, inter-/intra-subject context, and instance features. A SAM-based shared backbone extracts robust anatomical representations, while a lightweight controller generates task-specific deformation kernels to produce a smooth, plausible deformation field for warping moving images toward fixed targets. The approach achieves state-of-the-art or near state-of-the-art accuracy across inter- and intra-subject registrations for 90 organs, with roughly half the training iterations of conventional task-specific models and strong generalization to unseen tasks via minimal conditioning. This yields a flexible, efficient solution for large-scale clinical workflows, enabling rapid adaptation to diverse registration scenarios with reduced computational demands.
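
The warping step mentioned above can be illustrated with a minimal sketch. It assumes a 2D image, a dense displacement field stored as one `(dy, dx)` pair per pixel, bilinear interpolation, and border clamping. None of these choices come from the paper, which works on 3D CT volumes, and `warp_2d` is a hypothetical helper name.

```python
import math


def warp_2d(moving, disp):
    """Resample `moving` at (y + dy, x + dx) for every output pixel.

    Uses bilinear interpolation and clamps sample positions to the image
    border. This is an illustrative assumption, not the paper's warping layer.
    """
    h, w = len(moving), len(moving[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dy, dx = disp[y][x]
            # Clamp the sampling position so it stays inside the image.
            sy = min(max(y + dy, 0.0), h - 1.0)
            sx = min(max(x + dx, 0.0), w - 1.0)
            y0, x0 = int(math.floor(sy)), int(math.floor(sx))
            y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
            wy, wx = sy - y0, sx - x0
            # Interpolate along x on the two neighbouring rows, then along y.
            top = (1 - wx) * moving[y0][x0] + wx * moving[y0][x1]
            bottom = (1 - wx) * moving[y1][x0] + wx * moving[y1][x1]
            out[y][x] = (1 - wy) * top + wy * bottom
    return out


moving = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0, 8.0]]
identity = [[(0.0, 0.0)] * 3 for _ in range(3)]
shift_right = [[(0.0, 1.0)] * 3 for _ in range(3)]  # sample one pixel to the right

warped_identity = warp_2d(moving, identity)
warped_shift = warp_2d(moving, shift_right)
```

A zero displacement field returns the moving image unchanged, which is a quick sanity check for any warping routine. Established libraries offer equivalent resampling for 3D volumes.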

Abstract

Learning-based medical image registration has reached performance parity with conventional methods while offering a substantial advantage in computational efficiency. However, learning-based approaches generalize poorly across diverse clinical scenarios and require the laborious development of multiple isolated networks for specific registration tasks, e.g., inter-/intra-subject registration or organ-specific alignment. To overcome this limitation, we propose UniReg, the first interactive foundation model for medical image registration, which combines the precision of task-specific learning methods with the generalization of traditional optimization methods. Our key innovation is a unified framework for diverse registration scenarios, achieved through conditional deformation field estimation within a single registration model. This is realized through a dynamic learning paradigm that explicitly encodes (1) anatomical structure priors, (2) registration type constraints (inter-/intra-subject), and (3) instance-specific features, enabling the generation of scenario-optimal deformation fields. In comprehensive experiments covering 90 anatomical structures in different body regions, UniReg performs comparably to contemporary state-of-the-art methods while requiring about 50% fewer training iterations than the conventional learning-based paradigm, which significantly reduces computational cost such as training time. Code and model will be available.
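
The conditional deformation estimation described above might be sketched as follows. This is a toy illustration under stated assumptions, not the authors' implementation. The condition vector concatenates a one-hot anatomical prior, an inter/intra-subject flag, and pooled instance features. A hypothetical linear `Controller` maps it to the weights of a per-voxel (1x1x1-style) head that outputs a 3-component displacement. All names, dimensions, and the random initialization are assumptions.

```python
import random


def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]


def build_condition(organ_id, num_organs, is_intra_subject, instance_feat):
    """Concatenate the three conditioning signals named in the abstract."""
    flag = [1.0 if is_intra_subject else 0.0]
    return one_hot(organ_id, num_organs) + flag + list(instance_feat)


class Controller:
    """Hypothetical linear controller: condition vector -> head parameters."""

    def __init__(self, cond_dim, feat_channels, seed=0):
        rng = random.Random(seed)
        self.feat_channels = feat_channels
        n_out = 3 * feat_channels + 3  # 3 kernels of length C, plus 3 biases
        self.weights = [
            [rng.uniform(-0.1, 0.1) for _ in range(cond_dim)] for _ in range(n_out)
        ]

    def __call__(self, cond):
        params = [sum(w * c for w, c in zip(row, cond)) for row in self.weights]
        c = self.feat_channels
        kernel = [params[i * c:(i + 1) * c] for i in range(3)]
        bias = params[3 * c:]
        return kernel, bias


def dynamic_head(features, kernel, bias):
    """Apply the generated kernels to each voxel feature vector."""
    return [
        tuple(sum(k * f for k, f in zip(kernel[a], voxel)) + bias[a] for a in range(3))
        for voxel in features
    ]


# Two voxels with 4 backbone feature channels each (made-up numbers).
features = [[0.2, -0.1, 0.5, 0.0], [1.0, 0.3, -0.4, 0.7]]
instance_feat = [sum(col) / len(features) for col in zip(*features)]  # mean pooling

controller = Controller(cond_dim=90 + 1 + 4, feat_channels=4)
cond_inter = build_condition(5, 90, False, instance_feat)  # organ index 5 is arbitrary
cond_intra = build_condition(5, 90, True, instance_feat)

field_inter = dynamic_head(features, *controller(cond_inter))
field_intra = dynamic_head(features, *controller(cond_intra))
```

The same backbone features yield different displacement vectors once the registration-type flag changes. This is the sense in which one network can serve several tasks without separate weights for each.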

Paper Structure

This paper contains 24 sections, 4 equations, 5 figures, 8 tables.

Figures (5)

  • Figure 1: (a) Task-specific registration model: Conventional learning-based registration requires manual design of specialized networks for each anatomical region and registration type. This results in redundant development effort, since $N$ distinct tasks require training $N$ isolated models. (b) Controllable universal registration model: UniReg introduces a conditional control module that dynamically adapts to anatomical regions (circle symbols), registration types (rectangle symbols: inter/intra-patient), and imaging characteristics. By systematically integrating anatomical priors and knowledge within a unified architecture, the model achieves whole-body organ registration with task-aware deformation field generation.
  • Figure 2: Overview of the proposed universal registration model. Our framework comprises a backbone architecture and Dynamic Registration. The dynamic controller mechanism lets us incorporate human expertise and prior knowledge of diverse registration tasks throughout training. At each training iteration, the framework selects the task-specific hyperparameters and segmentation masks corresponding to the task ID, so that the optimization objective is tailored to the individual registration task.
  • Figure 3: Example slices of the top two registration methods. The warped anatomical segmentations are overlaid, and major registration artifacts are highlighted with blue arrows. SAME$*$ denotes the semi-supervised SAME model. Chest: A.ascendens, A.descendens, A.pulmonary, A.vertebral.R, Bronchus.R, Lung.L, Spine, Sternum, Trachea, V.subclavian R. Abdomen: liver and spleen. Multiphase CT: tumor. HeadNeck: Brain, ETbone L, Eye L, Hippocampus L, IAC R, MiddleEar R, OpticNerve R, Pharynx, Thyroid, TympanicCavity L.
  • Figure 4: Network Architecture of the Shared Backbone. Each blue rectangle represents a convolutional layer, with the number of channels indicated within the rectangle and the spatial resolution relative to the input volume displayed below. Each convolutional layer is followed by a LeakyReLU activation.
  • Figure 5: Examples of three-dimensional volume renderings derived from the warped segmentations for the HeadNeck, Chest, and Abdomen tasks. The white boxes highlight the regions in which UniReg attains precise alignment.
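
The per-iteration task selection described in the Figure 2 caption could look roughly like the sketch below. The task table, the label lists, the weights, and the "similarity plus weighted smoothness" form of the objective are all placeholders chosen for illustration. The captions state only that hyperparameters and segmentation masks are selected per task ID.

```python
import random

# Hypothetical task table: names, mask labels, and weights are illustrative only.
TASKS = {
    0: {"name": "chest_inter", "mask_labels": ["Lung.L", "Trachea"], "smooth_weight": 1.0},
    1: {"name": "abdomen_inter", "mask_labels": ["liver", "spleen"], "smooth_weight": 1.0},
    2: {"name": "multiphase_intra", "mask_labels": ["tumor"], "smooth_weight": 0.5},
}


def sample_task(rng):
    """Pick one task ID for the current training iteration."""
    task_id = rng.choice(sorted(TASKS))
    return task_id, TASKS[task_id]


def total_loss(similarity, smoothness, task):
    """Assumed objective: similarity term plus a task-weighted smoothness term."""
    return similarity + task["smooth_weight"] * smoothness


rng = random.Random(0)
for step in range(3):
    task_id, task = sample_task(rng)
    # A real loop would load an image pair and the masks in task["mask_labels"],
    # run the network conditioned on task_id, and backpropagate the loss.
    loss = total_loss(similarity=2.0, smoothness=4.0, task=task)
```

Keeping the task-specific settings in one table makes it straightforward to add a new registration scenario without touching the training loop.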