Reconstructing Retinal Visual Images from 3T fMRI Data Enhanced by Unsupervised Learning
Yujian Xiong, Wenhui Zhu, Zhong-Lin Lu, Yalin Wang
TL;DR
This work tackles cross-subject visual image reconstruction from fMRI under data limitations by generating enhanced 3T fMRI data with an unsupervised OT-GAN trained on unpaired NSD (7T) and NOD (3T) data. The enhanced 3T representations are mapped to latent visual and semantic spaces through linear mappings and then passed to a pre-trained Stable Diffusion model to reconstruct images, achieving higher quality than single-subject baselines. The method attains a lower FID (40.48) on 70 shared images than previous approaches and demonstrates reconstruction for an unseen subject from brief 3T scans, highlighting its practical applicability. The approach enables scalable cross-subject decoding under realistic data constraints and points to extensions in pRF mapping and clinical diagnostics.
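To make the linear-mapping step in the summary above concrete, here is a minimal sketch of fitting a linear map from fMRI features to latent vectors. It assumes ridge regression as the estimator, which this summary does not specify, and it runs on synthetic data. The names `fit_ridge` and `predict_latents`, the array sizes, and the `alpha` value are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^{-1} X^T Y.

    X: (n_trials, n_voxels) fMRI features (hypothetical layout).
    Y: (n_trials, latent_dim) target latent vectors.
    """
    n_features = X.shape[1]
    gram = X.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(gram, X.T @ Y)


def predict_latents(X, W):
    """Map fMRI features to predicted latent vectors with the fitted weights."""
    return X @ W


# Synthetic stand-in data; the sizes are arbitrary and chosen for illustration only.
rng = np.random.default_rng(0)
n_trials, n_voxels, latent_dim = 200, 50, 16
X_train = rng.standard_normal((n_trials, n_voxels))
W_true = rng.standard_normal((n_voxels, latent_dim))
Y_train = X_train @ W_true + 0.01 * rng.standard_normal((n_trials, latent_dim))

# Fit the mapping, then predict latents for held-out (synthetic) trials.
W_hat = fit_ridge(X_train, Y_train, alpha=1e-3)
X_test = rng.standard_normal((10, n_voxels))
Y_pred = predict_latents(X_test, W_hat)
```

In a pipeline like the one summarized here, the predicted latents would condition a pre-trained generative model. With real fMRI data the voxel count usually far exceeds the number of trials, so the regularization strength would need tuning on held-out data.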
Abstract
Reconstructing human visual inputs from brain activity, particularly from functional Magnetic Resonance Imaging (fMRI), is a promising route to understanding the mechanisms of the human visual system. Although deep learning methods have significantly improved the quality and interpretability of visual reconstruction, they still demand high-quality, long-duration, subject-specific 7-Tesla fMRI experiments. Integrating diverse, smaller 3-Tesla datasets or accommodating new subjects with brief, low-quality fMRI scans remains a challenge. To address these constraints, we propose a novel framework that generates enhanced 3T fMRI data through an unsupervised Generative Adversarial Network (GAN), trained on unpaired data from two distinct fMRI datasets acquired at 7T and 3T, respectively. This approach aims to overcome both the scarcity of high-quality 7-Tesla data and the challenges of brief, low-quality scans in 3-Tesla experiments. We demonstrate the reconstruction capabilities of the enhanced 3T fMRI data, showing that it yields better reconstructions of the input visual images than data-intensive methods trained and tested on a single subject.
