A Survey on fMRI-based Brain Decoding for Reconstructing Multimodal Stimuli

Pengyu Liu, Guohua Dong, Dan Guo, Kun Li, Fengling Li, Xun Yang, Meng Wang, Xiaomin Ying

TL;DR

This survey systematically synthesizes fMRI-based brain decoding for reconstructing multimodal stimuli by surveying datasets, brain regions, and model architectures (end-to-end, pretrained generation, encoder-alignment, LLM-centric, and hybrid). It details evaluation metrics and presents qualitative and quantitative results across major datasets (e.g., NSD, BOLD5000, GOD) to establish baselines and compare methods. The work highlights key challenges—such as fMRI’s low temporal resolution and inter-subject variability—and outlines future directions including high-resolution temporal modeling, personalized decoding, and multimodal neural-semantic alignment, with attention to ethical and clinical considerations. Collectively, it provides a structured, forward-looking roadmap for advancing brain decoding research and its translational potential in BCIs and neurotechnology.

Abstract

In daily life, we encounter diverse external stimuli, such as images, sounds, and videos. As research in multimodal stimuli and neuroscience advances, fMRI-based brain decoding has become a key tool for understanding brain perception and its complex cognitive processes. Decoding brain signals to reconstruct stimuli not only reveals intricate neural mechanisms but also drives progress in AI, disease treatment, and brain-computer interfaces. Recent advancements in neuroimaging and image generation models have significantly improved fMRI-based decoding. While fMRI offers high spatial resolution for precise brain activity mapping, its low temporal resolution and signal noise pose challenges. Meanwhile, techniques like GANs, VAEs, and Diffusion Models have enhanced reconstructed image quality, and multimodal pre-trained models have boosted cross-modal decoding tasks. This survey systematically reviews recent progress in fMRI-based brain decoding, focusing on stimulus reconstruction from passive brain signals. It summarizes datasets and relevant brain regions, and categorizes existing methods by model structure. Additionally, it evaluates model performance and discusses the effectiveness of these methods. Finally, it identifies key challenges and proposes future research directions, offering valuable insights for the field. For more information and resources related to this survey, visit https://github.com/LpyNow/BrainDecodingImage.

Paper Structure

This paper contains 34 sections, 32 equations, 6 figures, 5 tables.

Figures (6)

  • Figure 1: The basic structure of fMRI-based brain decoding tasks: After the human body receives external stimuli, the brain generates signals in specific regions. By decoding these brain signals and using generative models for reconstruction, the stimulus signals received by the brain are restored.
  • Figure 2: A content overview covered in the survey.
  • Figure 3: The relative positions of the 38 subcategories of brain ROIs related to brain decoding, which are described in Section \ref{sec:ROI}. Since ROIs are irregular polygons and different ROIs may overlap to some extent, we use spheres to visually represent their approximate spatial relationships. Deep red represents the Other Visual and Sensory Areas, red represents the Language and Motion-Related Areas, yellow represents the Multimodal Perception Areas, deep blue represents the Early Visual Cortex, blue represents the Higher Visual Cortex, orange represents the Face Recognition-Related Areas, light blue represents the Motion Visual Areas, and green represents the Auditory-Related Areas. These ROIs are crucial for in-depth research on the stimulus-response relationship in the brain.
  • Figure 4: Overview of Models for fMRI-Based Brain Decoding Tasks.
  • Figure 5: Model Classifications for fMRI-Based Brain Decoding Tasks.
  • ...and 1 more figure