A Survey of fMRI to Image Reconstruction
Weiyu Guo, Guoying Sun, JianXiang He, Tong Shao, Shaoguang Wang, Ziyang Chen, Meisheng Hong, Ying Sun, Hui Xiong
TL;DR
The paper surveys the nascent field of reconstructing visual content from fMRI signals (fMRI2Image), framing it as a three-stage pipeline: fMRI signal encoding, feature mapping, and image reconstruction. It reviews public datasets (static and dynamic), outlines a taxonomy emphasizing encoding architectures and pretraining, and highlights diffusion-based generators as the dominant reconstruction approach, with multi-modal alignment (notably CLIP) guiding feature mapping. Key contributions include a structured taxonomy, a synthesis of dataset evolution, and a discussion of optimization objectives spanning reconstruction quality and cross-subject generalization, along with practical future directions such as few-shot learning and subject-independent models. The work provides a reference point for researchers by cataloging methods, datasets, and trends, underscoring the potential impact on neuroscience and brain-computer interfaces while candidly addressing current limitations in data, variability, and interpretability.
Abstract
Functional magnetic resonance imaging (fMRI) based image reconstruction plays a pivotal role in decoding human perception, with applications in neuroscience and brain-computer interfaces. While recent advances in deep learning and large-scale datasets have driven progress, challenges such as data scarcity, cross-subject variability, and low semantic consistency persist. To address these issues, we introduce the concept of fMRI-to-Image Learning (fMRI2Image) and present the first systematic review of this field. The review highlights key challenges and categorizes methodologies into three components: fMRI signal encoding, feature mapping, and image generation. Finally, we propose promising research directions to advance this emerging frontier and to provide a reference for future studies.
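
As a rough illustration of the three components named above (fMRI signal encoding, feature mapping, and image generation), the following minimal sketch runs a toy version of the pipeline on synthetic data. Everything in it is an assumption made for illustration and is not taken from the surveyed methods: the function names are hypothetical, the encoder is a z-score followed by a random linear projection, the feature mapping is closed-form ridge regression onto stand-in image embeddings (random vectors in place of CLIP features), and the generation stage is replaced by nearest-neighbor retrieval rather than a diffusion model.

```python
import numpy as np

def encode_fmri(voxels, proj):
    """Stage 1 (signal encoding): z-score each voxel across samples, then project linearly."""
    mean = voxels.mean(axis=0, keepdims=True)
    std = voxels.std(axis=0, keepdims=True) + 1e-8
    return ((voxels - mean) / std) @ proj

def fit_feature_mapping(latents, targets, alpha=1.0):
    """Stage 2 (feature mapping): ridge regression from fMRI latents to image embeddings."""
    gram = latents.T @ latents + alpha * np.eye(latents.shape[1])
    return np.linalg.solve(gram, latents.T @ targets)

def retrieve(pred, gallery):
    """Stage 3 stand-in: pick the gallery embedding with the highest cosine similarity.
    A real system would condition an image generator on `pred` instead."""
    p = pred / (np.linalg.norm(pred, axis=1, keepdims=True) + 1e-8)
    g = gallery / (np.linalg.norm(gallery, axis=1, keepdims=True) + 1e-8)
    return (p @ g.T).argmax(axis=1)

# Synthetic data (assumption): voxel responses are a noisy linear function
# of the embedding of the image being viewed.
rng = np.random.default_rng(0)
n_samples, n_voxels, n_latent, n_embed = 200, 500, 32, 16
image_embeddings = rng.normal(size=(n_samples, n_embed))
voxels = (image_embeddings @ rng.normal(size=(n_embed, n_voxels))
          + 0.01 * rng.normal(size=(n_samples, n_voxels)))

proj = rng.normal(size=(n_voxels, n_latent)) / np.sqrt(n_voxels)
latents = encode_fmri(voxels, proj)
weights = fit_feature_mapping(latents, image_embeddings)
matches = retrieve(latents @ weights, image_embeddings)

accuracy = (matches == np.arange(n_samples)).mean()
print("retrieval accuracy on synthetic training data:", accuracy)
```

The sketch evaluates on the same samples it was fitted on, so the reported accuracy says nothing about generalization; it only shows how the three stages connect. Cross-subject variability, which the survey identifies as a core challenge, would appear here as a different voxel-to-embedding relationship for each subject.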
