I$^3$Net: Inter-Intra-slice Interpolation Network for Medical Slice Synthesis
Haofei Song, Xintian Mao, Jing Yu, Qingli Li, Yan Wang
TL;DR
I$^3$Net tackles the problem of anisotropic medical volumes by performing slice-wise interpolation from the axial view, exploiting high in-plane detail while compensating for low through-plane resolution. The model combines an inter-slice branch that enriches through-plane information, an intra-slice branch that learns equalized frequency-band representations via a windowed MLP-Mixer on DCT-transformed features, and a cross-view block that fuses information from the axial, coronal, and sagittal views online. Key contributions are the I$^2$Block architecture and the cross-view mechanism, which together achieve state-of-the-art PSNR/SSIM on MSD, KiTS19, and IXI, with a reported PSNR of $43.90$ dB on MSD at $\times$2 and faster inference. The approach generalizes well across datasets and offers a practical, efficient way to improve through-plane resolution in medical imaging, with potential clinical value for preserving lesion detail and supporting downstream analysis.
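To make the phrase "DCT-transformed features" in non-overlapping windows more concrete, the following is a minimal NumPy sketch, not the authors' implementation. The function names (`dct_matrix`, `windowed_dct2`, `band_energy`), the window size of 8, and the use of a single 2D feature map are all assumptions made for illustration; the MLP-Mixer that the paper applies to these coefficients is not reproduced here.

```python
import numpy as np


def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT-II basis as an (n, n) matrix; row k holds frequency k."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    mat = np.cos(np.pi * (2 * i + 1) * k / (2 * n)) * np.sqrt(2.0 / n)
    mat[0, :] /= np.sqrt(2.0)  # rescale the DC row so the basis is orthonormal
    return mat


def windowed_dct2(feature: np.ndarray, window: int = 8) -> np.ndarray:
    """2D DCT applied independently to each non-overlapping window.

    feature: (H, W) array with H and W divisible by `window` (assumed).
    Returns coefficients of shape (H // window, W // window, window, window).
    """
    h, w = feature.shape
    if h % window or w % window:
        raise ValueError("feature size must be divisible by the window size")
    d = dct_matrix(window)
    # Split into (rows of windows, cols of windows, window, window) blocks.
    blocks = feature.reshape(h // window, window, w // window, window)
    blocks = blocks.transpose(0, 2, 1, 3)
    # For each block X, the 2D DCT is D @ X @ D.T (matmul broadcasts over blocks).
    return d @ blocks @ d.T


def band_energy(coeffs: np.ndarray) -> np.ndarray:
    """Mean squared coefficient at each frequency position, averaged over windows."""
    return np.mean(coeffs ** 2, axis=(0, 1))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical smooth feature map: cumulative sums give low-frequency content.
    fmap = np.cumsum(np.cumsum(rng.standard_normal((64, 64)), axis=0), axis=1)
    energy = band_energy(windowed_dct2(fmap, window=8))
    print("DC energy:", energy[0, 0])
    print("highest-frequency energy:", energy[-1, -1])
```

On smooth inputs the energy typically concentrates in the low-frequency corner of each window, which is the kind of imbalance that an "equal learning opportunity for all frequency bands" is meant to counteract.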
Abstract
Medical imaging is limited by acquisition time and scanning equipment. CT and MR volumes reconstructed with thicker slices are anisotropic, with high in-plane resolution and low through-plane resolution. We reveal an intriguing phenomenon: because of this anisotropy, performing slice-wise interpolation from the axial view can yield greater benefits than performing super-resolution from other views. Based on this observation, we propose an Inter-Intra-slice Interpolation Network (I$^3$Net), which fully explores information from high in-plane resolution and compensates for low through-plane resolution. The through-plane branch supplements the limited information contained in low through-plane resolution with information from high in-plane resolution and enables continual and diverse feature learning. The in-plane branch transforms features to the frequency domain and enforces an equal learning opportunity for all frequency bands in a global context learning paradigm. We further propose a cross-view block to take advantage of the information from all three views online. Extensive experiments on two public datasets demonstrate the effectiveness of I$^3$Net, which outperforms state-of-the-art super-resolution, video frame interpolation, and slice interpolation methods by a large margin. We achieve 43.90 dB in PSNR, an improvement of at least 1.14 dB, under the upscale factor of $\times$2 on the MSD dataset, with faster inference. Code is available at https://github.com/DeepMed-Lab-ECNU/Medical-Image-Reconstruction.
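The task setup and the PSNR metric quoted above can be illustrated with a small, self-contained sketch. It simulates a $\times$2 anisotropic volume by dropping every other axial slice, restores the missing slices with plain linear interpolation (a naive baseline, not I$^3$Net), and scores the result with PSNR. The function names, the synthetic volume, and the intensity range of $[0, 1]$ are assumptions for illustration; the paper's preprocessing and evaluation protocol may differ.

```python
import numpy as np


def downsample_axial(volume: np.ndarray, factor: int = 2) -> np.ndarray:
    """Simulate thick-slice acquisition by keeping every `factor`-th axial slice."""
    return volume[::factor]


def linear_slice_interpolation(lr: np.ndarray, factor: int = 2) -> np.ndarray:
    """Naive baseline: linearly blend each pair of adjacent slices.

    lr: (D, H, W) low through-plane-resolution volume.
    Returns a volume with (D - 1) * factor + 1 slices.
    """
    depth = lr.shape[0]
    out_depth = (depth - 1) * factor + 1
    out = np.empty((out_depth,) + lr.shape[1:], dtype=np.float64)
    for z in range(out_depth):
        lo, rem = divmod(z, factor)
        hi = min(lo + 1, depth - 1)
        t = rem / factor  # fractional position between slice lo and slice hi
        out[z] = (1.0 - t) * lr[lo] + t * lr[hi]
    return out


def psnr(reference: np.ndarray, estimate: np.ndarray, data_range: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB; `data_range` is the assumed intensity span."""
    diff = reference.astype(np.float64) - estimate.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(data_range ** 2 / mse))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical volume in [0, 1]; an odd depth keeps the last slice after x2 subsampling.
    hr = rng.random((33, 64, 64))
    lr = downsample_axial(hr, factor=2)
    restored = linear_slice_interpolation(lr, factor=2)
    print("restored shape:", restored.shape)
    print("baseline PSNR (dB):", psnr(hr, restored))
```

Because the retained slices are copied unchanged, all of the error in this baseline comes from the interpolated slices, which is exactly where a learned slice-interpolation model is expected to improve.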
