Reconstructing Retinal Visual Images from 3T fMRI Data Enhanced by Unsupervised Learning

Yujian Xiong, Wenhui Zhu, Zhong-Lin Lu, Yalin Wang

TL;DR

This work tackles cross-subject visual image reconstruction from fMRI under data limitations by generating enhanced 3T fMRI data via an unsupervised OT-GAN trained on unpaired NSD (7T) and NOD (3T) data. The enhanced 3T representations are linked to latent visual and semantic spaces through linear mappings and then used with a pre-trained Stable Diffusion model to reconstruct images, achieving superior quality compared to single-subject baselines. The method yields a lower FID (40.48) on 70 shared images than previous approaches and demonstrates reconstruction for an unseen subject with brief 3T scans, highlighting practical applicability. This approach enables scalable cross-subject decoding with realistic data constraints and points to extensions in pRF mapping and clinical diagnostics.
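The linear-mapping step mentioned above can be illustrated with a small sketch. The summary says only that the mappings are linear, so everything else here is an assumption: the closed-form ridge regression, the `alpha` value, the array sizes, and the synthetic data standing in for enhanced 3T responses and latent targets. The helper names `fit_ridge` and `predict` are hypothetical and not taken from the paper.

```python
import numpy as np


def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^{-1} X^T Y.

    X: (n_trials, n_voxels) fMRI responses (synthetic stand-in here).
    Y: (n_trials, n_latent) target latent vectors (synthetic stand-in here).
    No intercept is fitted; inputs are assumed to be centered.
    """
    n_features = X.shape[1]
    gram = X.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(gram, X.T @ Y)


def predict(X, W):
    """Map fMRI responses to predicted latent vectors."""
    return X @ W


# Synthetic example: all sizes below are illustrative, not from the paper.
rng = np.random.default_rng(0)
n_trials, n_voxels, n_latent = 200, 50, 8
true_W = rng.normal(size=(n_voxels, n_latent))
X = rng.normal(size=(n_trials, n_voxels))
Y = X @ true_W + 0.01 * rng.normal(size=(n_trials, n_latent))

W = fit_ridge(X, Y, alpha=1e-3)
Z_hat = predict(X, W)
```

In a pipeline of the kind described, the predicted latents `Z_hat` would then condition the pre-trained diffusion model. That step is omitted here because the summary does not specify its interface.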

Abstract

Reconstructing human visual inputs from brain activity, particularly from functional Magnetic Resonance Imaging (fMRI), offers a promising avenue for understanding the mechanisms of the human visual system. Although deep learning methods have made significant strides in improving the quality and interpretability of visual reconstruction, they still depend on high-quality, long-duration, subject-specific 7-Tesla fMRI experiments. This makes it difficult to integrate diverse, smaller 3-Tesla datasets or to accommodate new subjects with brief, low-quality fMRI scans. In response to these constraints, we propose a novel framework that generates enhanced 3T fMRI data through an unsupervised Generative Adversarial Network (GAN) trained on unpaired data from two distinct fMRI datasets, one acquired at 7T and one at 3T. This approach aims to address both the scarcity of high-quality 7-Tesla data and the challenges of brief, low-quality scans in 3-Tesla experiments. In this paper, we demonstrate the reconstruction capabilities of the enhanced 3T fMRI data, showing that it yields better reconstructions of the input visual images than data-intensive methods trained and tested on a single subject.

Paper Structure (11 sections, 4 equations, 2 figures, 2 tables)

Figures (2)

  • Figure 1: Illustration of the proposed framework: (a) overview of the entire pipeline, with red arrows indicating models that require training; (b) overview of the proposed OT-GAN structure.
  • Figure 2: Ground truth and corresponding reconstruction from the 9th subject of the NOD dataset.