I$^3$Net: Inter-Intra-slice Interpolation Network for Medical Slice Synthesis

Haofei Song, Xintian Mao, Jing Yu, Qingli Li, Yan Wang

TL;DR

I$^3$Net addresses anisotropic medical volumes through slice-wise interpolation from the axial view, exploiting high in-plane detail to compensate for low through-plane resolution. The model combines an inter-slice branch that enriches through-plane information, an intra-slice branch that learns equalized frequency-band representations via a windowed MLP-Mixer on DCT-transformed features, and a cross-view block that fuses information from the axial, coronal, and sagittal views online. The key contributions are the I$^2$Block architecture and the cross-view mechanism, which achieve state-of-the-art PSNR/SSIM on MSD, KiTS19, and IXI, with a reported PSNR of $43.90$ dB on MSD at $\times 2$ and faster inference. The approach generalizes across datasets and offers a practical, efficient way to improve through-plane resolution in medical imaging, with potential clinical value for preserving lesion detail and for downstream analysis.

Abstract

Medical imaging is limited by acquisition time and scanning equipment. CT and MR volumes reconstructed with thicker slices are anisotropic, with high in-plane resolution and low through-plane resolution. We reveal an intriguing phenomenon: because of this anisotropy, performing slice-wise interpolation from the axial view can yield greater benefits than performing super-resolution from the other views. Based on this observation, we propose an Inter-Intra-slice Interpolation Network (I$^3$Net), which fully explores the information available at high in-plane resolution and compensates for low through-plane resolution. The through-plane branch supplements the limited information available at low through-plane resolution with information from the high in-plane resolution, and enables continual and diverse feature learning. The in-plane branch transforms features to the frequency domain and enforces an equal learning opportunity for all frequency bands in a global context learning paradigm. We further propose a cross-view block to take advantage of the information from all three views online. Extensive experiments on two public datasets demonstrate the effectiveness of I$^3$Net, which outperforms state-of-the-art super-resolution, video frame interpolation, and slice interpolation methods by a large margin. We achieve 43.90 dB in PSNR, an improvement of at least 1.14 dB, under the upscale factor of $\times 2$ on the MSD dataset, with faster inference. Code is available at https://github.com/DeepMed-Lab-ECNU/Medical-Image-Reconstruction.
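
To put the PSNR figures quoted in the abstract in context, here is a minimal sketch of the standard PSNR definition, $10\log_{10}(\mathrm{peak}^2/\mathrm{MSE})$, together with the MSE reduction that a given dB gain implies. The function names and the `peak=1.0` intensity range are illustrative assumptions; the abstract does not state the intensity normalization used in the paper's evaluation.

```python
import math

def psnr(reference, estimate, peak=1.0):
    """PSNR in dB between two equally sized sequences of intensities.

    `peak` is the assumed maximum intensity (hypothetical default of 1.0).
    """
    if len(reference) != len(estimate):
        raise ValueError("inputs must have the same length")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs
    return 10.0 * math.log10(peak ** 2 / mse)

def mse_ratio_for_gain(gain_db):
    """Factor by which the MSE shrinks for a PSNR gain of `gain_db` (same peak)."""
    return 10.0 ** (-gain_db / 10.0)

# A 1.14 dB gain corresponds to an MSE ratio of 10^(-0.114).
ratio = mse_ratio_for_gain(1.14)
```

Under this definition, a gain of 1.14 dB at a fixed peak value means the mean squared error falls to roughly 0.77 of its previous value, i.e. about 23% lower.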

Paper Structure

This paper contains 28 sections, 6 equations, 15 figures, 13 tables.

Figures (15)

  • Figure 1: (a) CT volumes are anisotropic, with thick slices. (b) Comparison of typical SR backbones applied from the three views against our method. "coronal", "sagittal" and "axial" denote super-resolution from the coronal or sagittal view and interpolation from the axial view; "fuse" denotes fusing the results from multiple views. In contrast, our method directly applies interpolation from the axial view. Interpolation from the axial view clearly outperforms SR from the other two views. The fusion strategy yields a slight improvement, but at a higher cost. We therefore approach the medical slice synthesis task via slice-wise interpolation from the axial view.
  • Figure 2: The architecture of I$^3$Net. Our I$^3$Net consists of several I$^2$Blocks and three Cross-view Blocks.
  • Figure 3: The architecture of I$^2$Block, which consists of an inter-slice branch and an intra-slice branch.
  • Figure 4: The processing of PixelShuffle and PixelUnshuffle within the inter-slice branch.
  • Figure 5: Fusing adjacent slices in the spatial domain and in the spectral domain. Left: the ground truth of the current slice. Middle: the direct weighted sum of adjacent slices. Right: the weighted sum of same-frequency components from adjacent slices.
  • ...and 10 more figures
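
The contrast drawn in the Figure 5 caption, between a direct weighted sum of adjacent slices and a weighted sum of same-frequency components, can be illustrated with a small sketch. It is not the paper's implementation: it assumes a 1-D orthonormal DCT-II applied to a single row of each slice, and the per-frequency weights are hypothetical values chosen for illustration. Because the DCT is linear, using one weight for every frequency reproduces the spatial-domain sum exactly; the two fusions only differ once the weights vary by frequency.

```python
import math

def dct(x):
    """Orthonormal 1-D DCT-II of a list of floats."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        out.append(scale * sum(v * math.cos(math.pi * (i + 0.5) * k / n)
                               for i, v in enumerate(x)))
    return out

def idct(coeffs):
    """Inverse of dct() above (orthonormal DCT-III)."""
    n = len(coeffs)
    return [sum(math.sqrt((1.0 if k == 0 else 2.0) / n) * c
                * math.cos(math.pi * (i + 0.5) * k / n)
                for k, c in enumerate(coeffs))
            for i in range(n)]

def fuse_spatial(prev_row, next_row, w=0.5):
    """Direct weighted sum of two adjacent rows in the spatial domain."""
    return [w * a + (1.0 - w) * b for a, b in zip(prev_row, next_row)]

def fuse_spectral(prev_row, next_row, weights):
    """Weighted sum of same-frequency DCT coefficients.

    weights[k] is the (hypothetical) weight on the previous slice for
    frequency k; the next slice receives 1 - weights[k].
    """
    p, q = dct(prev_row), dct(next_row)
    mixed = [w * a + (1.0 - w) * b for w, a, b in zip(weights, p, q)]
    return idct(mixed)

# Toy rows standing in for the same row of two adjacent slices.
prev_row = [0.0, 0.2, 0.9, 1.0]
next_row = [0.1, 0.8, 1.0, 0.4]

spatial = fuse_spatial(prev_row, next_row)
uniform = fuse_spectral(prev_row, next_row, [0.5] * 4)           # matches spatial
banded = fuse_spectral(prev_row, next_row, [0.5, 0.5, 0.9, 0.9])  # differs
```

In this sketch `uniform` equals `spatial` up to floating-point error, while `banded` takes its two highest-frequency components mostly from the previous row and so produces a different result.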