A Survey on fMRI-based Brain Decoding for Reconstructing Multimodal Stimuli

Pengyu Liu; Guohua Dong; Dan Guo; Kun Li; Fengling Li; Xun Yang; Meng Wang; Xiaomin Ying

TL;DR

This survey reviews fMRI-based brain decoding for reconstructing multimodal stimuli, covering datasets, relevant brain regions, and model architectures (end-to-end, pretrained generation, encoder-alignment, LLM-centric, and hybrid). It details evaluation metrics and presents qualitative and quantitative results across major datasets (e.g., NSD, BOLD5000, GOD) to establish baselines and compare methods. It highlights key challenges, such as fMRI’s low temporal resolution and inter-subject variability, and outlines future directions including high-resolution temporal modeling, personalized decoding, and multimodal neural-semantic alignment, with attention to ethical and clinical considerations. Together, these provide a structured roadmap for brain decoding research and its translational potential in BCIs and neurotechnology.

Abstract

In daily life, we encounter diverse external stimuli, such as images, sounds, and videos. As research in multimodal stimuli and neuroscience advances, fMRI-based brain decoding has become a key tool for understanding brain perception and its complex cognitive processes. Decoding brain signals to reconstruct stimuli not only reveals intricate neural mechanisms but also drives progress in AI, disease treatment, and brain-computer interfaces. Recent advancements in neuroimaging and image generation models have significantly improved fMRI-based decoding. While fMRI offers high spatial resolution for precise brain activity mapping, its low temporal resolution and signal noise pose challenges. Meanwhile, techniques like GANs, VAEs, and Diffusion Models have enhanced reconstructed image quality, and multimodal pre-trained models have boosted cross-modal decoding tasks. This survey systematically reviews recent progress in fMRI-based brain decoding, focusing on stimulus reconstruction from passive brain signals. It summarizes datasets, relevant brain regions, and categorizes existing methods by model structure. Additionally, it evaluates model performance and discusses their effectiveness. Finally, it identifies key challenges and proposes future research directions, offering valuable insights for the field. For more information and resources related to this survey, visit https://github.com/LpyNow/BrainDecodingImage.
