Cross-modal Medical Image Generation Based on Pyramid Convolutional Attention Network

Fuyou Mao, Lixin Lin, Ming Jiang, Dong Dai, Chao Yang, Hao Zhang, Yan Tang

TL;DR

This work addresses the challenge of missing PET data in Alzheimer's disease diagnostics by proposing PCSA-GAN, a cross-modal framework that translates structural MRI to PET. The generator fuses multi-scale local features via pyramid convolution with channel attention and injects global context through self-attention, guided by a joint loss that combines adversarial, voxel-wise, and MS-SSIM terms. On the ADNI dataset, PCSA-GAN achieves competitive quantitative metrics (MAE, PSNR, SSIM) and enhances AD classification performance when PET is simulated from sMRI, demonstrating both high image fidelity and practical diagnostic value. The approach offers a scalable, noninvasive pathway to augment multimodal imaging and improve early AD assessment in clinical settings.
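
The joint objective mentioned above can be written schematically as below. This is only a sketch: the weights $\lambda_{\mathrm{vox}}$ and $\lambda_{\mathrm{ms}}$, the L1 form of the voxel-wise term, and the $1 - \text{MS-SSIM}$ form of the structural term are assumptions made for illustration, not definitions or values taken from the paper. Here $x$ denotes the input sMRI volume, $y$ the real PET volume, $G$ the generator, and $D$ the discriminator.

```latex
\mathcal{L}_{G}
  = \mathcal{L}_{\mathrm{adv}}(G, D)
  + \lambda_{\mathrm{vox}} \, \lVert G(x) - y \rVert_{1}
  + \lambda_{\mathrm{ms}} \, \bigl( 1 - \mathrm{MS\text{-}SSIM}\bigl(G(x),\, y\bigr) \bigr)
```

Under this reading, the adversarial term pushes the generated PET toward the distribution of real scans, while the two reconstruction terms penalize per-voxel intensity error and multi-scale structural mismatch respectively.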

Abstract

Integrating multimodal medical imaging provides complementary and comprehensive information for the diagnosis of Alzheimer's disease (AD). In clinical practice, however, positron emission tomography (PET) scans are often missing, which leaves the multimodal data incomplete. To address this problem, we propose a method that uses structural magnetic resonance imaging (sMRI) to generate high-quality PET images. Our generator combines pyramid convolution with a channel attention mechanism to extract multi-scale local features from sMRI, and uses a self-attention mechanism to inject global correlation information into these features, so that the generated PET image recovers both local texture and global structure. We also introduce additional loss functions to guide the generator toward higher-quality PET images. In experiments on the publicly available ADNI database, the generated images outperform previous methods on several metrics (mean absolute error: 0.0194, peak signal-to-noise ratio: 29.65, structural similarity: 0.9486) and are close to the real images. For AD diagnosis, the generated images combined with their corresponding sMRI also perform well (classification accuracy: 94.21%) and outperform previous methods of the same type. These results show that our method outperforms competing methods in both quantitative metrics and qualitative visualization.
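
Two of the reported metrics, mean absolute error and peak signal-to-noise ratio, have standard definitions. The sketch below shows how they are commonly computed with NumPy. The function names `mae` and `psnr`, the assumption that intensities are normalized to [0, 1], and the random toy volumes are illustrative choices, not details from the paper or the ADNI data.

```python
import numpy as np


def mae(real, fake):
    """Mean absolute error between two volumes of equal shape."""
    real = np.asarray(real, dtype=np.float64)
    fake = np.asarray(fake, dtype=np.float64)
    return float(np.mean(np.abs(real - fake)))


def psnr(real, fake, data_range=1.0):
    """Peak signal-to-noise ratio in dB.

    data_range is the maximum possible intensity span; 1.0 assumes
    volumes normalized to [0, 1] (an assumption of this sketch).
    """
    real = np.asarray(real, dtype=np.float64)
    fake = np.asarray(fake, dtype=np.float64)
    mse = float(np.mean((real - fake) ** 2))
    if mse == 0.0:
        return float("inf")  # identical volumes
    return float(10.0 * np.log10(data_range ** 2 / mse))


if __name__ == "__main__":
    # Toy stand-ins for a real and a synthesized PET volume.
    rng = np.random.default_rng(0)
    real_pet = rng.random((8, 8, 8))
    noise = rng.normal(0.0, 0.02, size=real_pet.shape)
    synthetic_pet = np.clip(real_pet + noise, 0.0, 1.0)

    print("MAE :", mae(real_pet, synthetic_pet))
    print("PSNR:", psnr(real_pet, synthetic_pet))
```

Lower MAE and higher PSNR both indicate a synthesized volume closer to the reference. Structural similarity is omitted here because it needs a windowed computation that is better taken from an established library.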

Paper Structure

This paper contains 20 sections, 11 equations, 6 figures, 3 tables.

Figures (6)

  • Figure 1: The overall framework of the model
  • Figure 2: Channel attention mechanism
  • Figure 3: Differences in generated image performances under different methods.
  • Figure 4: Quality comparisons of generated images (Sagittal, Axial, and Coronal planes).
  • Figure 5: Comparison of image quality generated from previous studies.
  • ...and 1 more figure