A lightweight residual network for unsupervised deformable image registration

Ahsan Raza Siyal, Astrid Ellen Grams, Markus Haltmeier

TL;DR

This work proposes a CNN-based registration method that enlarges the receptive field with a residual U-Net containing parallel dilated-convolution blocks, keeps the parameter count low, and delivers strong results even on limited training datasets.

Abstract

Accurate volumetric image registration is highly relevant for clinical routines and computer-aided medical diagnosis. Recently, researchers have begun to use transformers in learning-based methods for medical image registration and have achieved remarkable success. Owing to their strong global modeling capability, transformers are considered a better option than convolutional neural networks (CNNs) for registration. However, they rely on bulky models with huge parameter sets, which require high-computation edge devices when deployed in portable devices or in hospitals. Transformers also need a large amount of training data to produce significant results, and it is often challenging to collect suitable annotated data. Although existing CNN-based registration methods offer rich local information, their global modeling capability is poor at handling long-distance information interaction, which limits registration performance. In this work, we propose a CNN-based registration method with an enhanced receptive field, a low number of parameters, and significant results on a limited training dataset. To this end, we propose a residual U-Net with embedded parallel dilated-convolutional blocks that enhance the receptive field. The proposed method is evaluated on inter-patient and atlas-based datasets. We show that its performance is comparable to, and slightly better than, that of transformer-based methods while using only $1.5\%$ of their number of parameters.
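The claim that dilated convolutions enlarge the receptive field without adding parameters can be checked with simple arithmetic. The sketch below is illustrative only: the helper names and the dilation rates (1, 2, 4) are assumptions, not values taken from the paper. It uses the standard relation that a kernel of size $k$ with dilation $d$ spans $d(k-1)+1$ voxels per axis.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span (in voxels, per axis) covered by one dilated convolution kernel."""
    return dilation * (kernel_size - 1) + 1


def stacked_receptive_field(layers) -> int:
    """Per-axis receptive field of a sequence of stride-1 convolutions.

    `layers` is a list of (kernel_size, dilation) pairs.
    """
    receptive_field = 1
    for kernel_size, dilation in layers:
        receptive_field += effective_kernel_size(kernel_size, dilation) - 1
    return receptive_field


def parallel_receptive_field(branches) -> int:
    """Per-axis receptive field of merged parallel branches: the widest branch sets it."""
    return max(effective_kernel_size(k, d) for k, d in branches)


def weights_per_channel_pair(kernel_size: int) -> int:
    """Weights of one 3D kernel per input/output channel pair; independent of dilation."""
    return kernel_size ** 3


# Hypothetical dilation rates; the rates used in the paper are not stated in this summary.
plain = stacked_receptive_field([(3, 1)])
parallel = parallel_receptive_field([(3, 1), (3, 2), (3, 4)])
print(plain, parallel, weights_per_channel_pair(3))
```

Under these assumed rates, the per-axis span grows from 3 to 9 voxels, while every branch still uses 27 weights per channel pair.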

Paper Structure (14 sections, 6 equations, 8 figures, 4 tables)

Figures (8)

  • Figure 2.1: Overview of the proposed method. The neural network learns the displacement field $\boldsymbol{\Phi}$ that registers the 3D moving image $\mathbf{M}$ to a 3D fixed image. During training, we warp $\mathbf{M}$ with $\boldsymbol{\Phi}$ using a spatial transformer function. The loss function measures the similarity between the warped image $\mathbf{M} \circ \boldsymbol{\Phi}$ and the fixed image, and encourages smoothness of $\boldsymbol{\Phi}$.
  • Figure 2.2: Top: the proposed Res-Unet with dilated convolution blocks. The network takes the stacked moving and fixed images and returns the displacement field. Bottom left: the residual block used. Bottom right: the dilated convolution block used.
  • Figure 3.1: Representative example of a registered image from the atlas-to-patient dataset.
  • Figure 3.2: Representative example of a registered image from the inter-patient dataset.
  • Figure 3.3: Residual images of a sample volume from the atlas-to-patient MRI dataset.
  • ...and 3 more figures
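The caption of Figure 2.1 describes a loss with two parts: a similarity term between the warped moving image and the fixed image, and a smoothness term on $\boldsymbol{\Phi}$. A minimal NumPy sketch of such a loss follows. The specific choices here (mean squared error for similarity, squared forward differences for smoothness, and the `weight` parameter) are assumptions made for illustration; this summary does not state which terms the paper uses.

```python
import numpy as np


def mse_similarity(warped: np.ndarray, fixed: np.ndarray) -> float:
    """Mean squared intensity difference (assumed similarity measure)."""
    return float(np.mean((warped - fixed) ** 2))


def smoothness_penalty(displacement: np.ndarray) -> float:
    """Mean squared forward difference of the displacement field.

    `displacement` has shape (3, D, H, W): one channel per spatial axis.
    The result is averaged over the three spatial directions.
    """
    total = 0.0
    for axis in (1, 2, 3):
        diff = np.diff(displacement, axis=axis)
        total += float(np.mean(diff ** 2))
    return total / 3.0


def registration_loss(warped, fixed, displacement, weight: float = 1.0) -> float:
    """Similarity plus weighted smoothness; `weight` is a hypothetical hyperparameter."""
    return mse_similarity(warped, fixed) + weight * smoothness_penalty(displacement)
```

A constant displacement field (a pure translation) incurs no smoothness penalty under this definition, so only spatially varying deformations are penalized.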

Theorems & Definitions (1)

  • Remark 2.1: Image warping
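As an illustration of image warping with a displacement field, the sketch below resamples a 3D volume with trilinear interpolation in NumPy. It assumes the common convention that the warped image at voxel $p$ equals the moving image at $p + \boldsymbol{\Phi}(p)$, with displacements in voxel units and sample positions clamped to the volume border. These conventions and the function name `warp_volume` are assumptions; the remark's exact definition is not reproduced in this summary.

```python
import itertools

import numpy as np


def warp_volume(moving: np.ndarray, displacement: np.ndarray) -> np.ndarray:
    """Return warped(p) = moving(p + displacement(p)) using trilinear interpolation.

    moving: array of shape (D, H, W).
    displacement: array of shape (3, D, H, W), in voxel units (assumed convention).
    """
    sizes = np.array(moving.shape).reshape(3, 1, 1, 1)
    axes = [np.arange(n) for n in moving.shape]
    grid = np.stack(np.meshgrid(*axes, indexing="ij")).astype(float)

    # Sample positions, clamped to the valid index range (assumed boundary rule).
    coords = np.clip(grid + displacement, 0, sizes - 1)

    lower = np.floor(coords).astype(int)
    upper = np.minimum(lower + 1, sizes - 1)
    frac = coords - lower

    warped = np.zeros(moving.shape, dtype=float)
    # Accumulate the contribution of the 8 corners surrounding each sample position.
    for corner in itertools.product((0, 1), repeat=3):
        weight = np.ones(moving.shape, dtype=float)
        index = []
        for axis, use_upper in enumerate(corner):
            index.append(upper[axis] if use_upper else lower[axis])
            weight = weight * (frac[axis] if use_upper else 1.0 - frac[axis])
        warped += weight * moving[index[0], index[1], index[2]]
    return warped
```

With a zero displacement field the function returns the moving image unchanged, which is a quick sanity check before using it with a learned field. In a training pipeline this step would need a differentiable implementation, which this NumPy version is not.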