Pyramid Attention Network for Medical Image Registration

Zhuoyuan Wang, Haiqiao Wang, Yi Wang

TL;DR

This paper tackles deformable medical image registration, focusing on capturing complex spatial relationships under large deformations. It introduces the unsupervised Pyramid Attention Network (PAN), which combines a dual-stream pyramid encoder with channel-wise attention and a multi-head local attention Transformer (LAT) decoder, augmented by orthogonal regularization to learn diverse motion patterns. PAN outperforms CNN-based and Transformer-based baselines on IXI, LPBA40, and AbdomenMRCT across metrics such as DSC, ASSD, and deformation smoothness, while maintaining fast inference (~0.6 s per pair). By integrating multi-scale feature extraction with localized attention, PAN offers a scalable, efficient framework for robust medical image registration with potential clinical impact.
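
DSC (Dice similarity coefficient), one of the metrics named above, measures the overlap between a warped segmentation and the fixed-image segmentation. The sketch below uses the standard definition $\mathrm{DSC} = 2|A \cap B| / (|A| + |B|)$; the function names `dice_coefficient` and `mean_dice`, and the convention of scoring two empty masks as 1.0, are illustrative assumptions rather than the authors' evaluation code.

```python
import numpy as np


def dice_coefficient(mask_a, mask_b):
    """Dice overlap between two binary masks of the same shape."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return float(2.0 * np.logical_and(a, b).sum() / total)


def mean_dice(seg_a, seg_b, labels):
    """Average Dice over a list of integer anatomical labels."""
    seg_a = np.asarray(seg_a)
    seg_b = np.asarray(seg_b)
    scores = [dice_coefficient(seg_a == lab, seg_b == lab) for lab in labels]
    return float(np.mean(scores))


if __name__ == "__main__":
    warped = np.array([[1, 1, 0], [0, 2, 2]])
    fixed = np.array([[1, 0, 0], [1, 2, 2]])
    print(mean_dice(warped, fixed, labels=[1, 2]))
```

In a registration setting, `seg_a` would typically be the moving segmentation after warping with the predicted deformation field, and `seg_b` the fixed segmentation.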

Abstract

The advent of deep-learning-based registration networks has addressed the time-consuming challenge in traditional iterative methods. However, the potential of current registration networks for comprehensively capturing spatial relationships has not been fully explored, leading to inadequate performance in large-deformation image registration. Pure convolutional neural networks (CNNs) neglect feature enhancement, while current Transformer-based networks are susceptible to information redundancy. To alleviate these issues, we propose a pyramid attention network (PAN) for deformable medical image registration. Specifically, the proposed PAN incorporates a dual-stream pyramid encoder with channel-wise attention to boost the feature representation. Moreover, a multi-head local attention Transformer is introduced as the decoder to analyze motion patterns and generate deformation fields. Extensive experiments on two public brain magnetic resonance imaging (MRI) datasets and one abdominal MRI dataset demonstrate that our method achieves favorable registration performance, while outperforming several CNN-based and Transformer-based registration networks. Our code is publicly available at https://github.com/JuliusWang-7/PAN.
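
The abstract does not spell out how the channel-wise attention is computed. As a rough illustration of the general idea, the sketch below implements a squeeze-and-excitation style gate in NumPy: each channel of a 3D feature map is averaged to a single value, passed through a small two-layer bottleneck, and the resulting sigmoid weights rescale the channels. The function name `channel_attention`, the weight shapes, and the SE-style formulation itself are assumptions for illustration and may differ from the module used in PAN.

```python
import numpy as np


def channel_attention(features, w1, w2):
    """SE-style channel gate (assumed form, not the authors' exact module).

    features: array of shape (C, D, H, W)
    w1:       bottleneck weights of shape (C // r, C)
    w2:       expansion weights of shape (C, C // r)
    Returns the reweighted features and the per-channel weights.
    """
    # Squeeze: global average over the three spatial axes -> (C,)
    squeezed = features.mean(axis=(1, 2, 3))
    # Excite: linear -> ReLU -> linear -> sigmoid, giving weights in (0, 1)
    hidden = np.maximum(w1 @ squeezed, 0.0)
    weights = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))
    # Rescale each channel by its weight (broadcast over D, H, W)
    return features * weights[:, None, None, None], weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.standard_normal((8, 4, 4, 4))   # 8 channels, 4x4x4 volume
    w1 = rng.standard_normal((2, 8))            # reduction ratio r = 4
    w2 = rng.standard_normal((8, 2))
    out, weights = channel_attention(feats, w1, w2)
    print(out.shape, weights.shape)
```

The gate leaves the spatial layout untouched and only changes the relative magnitude of channels, which is why this kind of module is commonly described as feature enhancement rather than feature extraction.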

Paper Structure (13 sections, 6 equations, 3 figures, 3 tables)

Figures (3)

  • Figure 1: The proposed pyramid attention network (PAN), which consists of the dual-stream pyramid encoder and the local attention Transformer decoder.
  • Figure 2: Illustration of the local attention Transformer (LAT), shown here with $S=3$ attention heads as an example.
  • Figure 3: Visualized registration results. The colored regions indicate the masks of different anatomical structures. First row: IXI; second row: LPBA40; third row: AbdomenMR.