MindSimulator: Exploring Brain Concept Localization via Synthetic fMRI

Guangyin Bao, Qi Zhang, Zixuan Gong, Zhuojia Wu, Duoqian Miao

TL;DR

MindSimulator tackles the challenge of localizing concept-selective regions in the visual cortex when real fMRI data are limited and biased, by generating synthetic fMRI conditioned on concept-oriented visual stimuli. It introduces a three-component generative encoding model (an fMRI Autoencoder, a Diffusion Estimator with $T$ timesteps, and an Inference Sampler with multi-trial enhancement and correlated noise) to learn conditional fMRI distributions in a latent space aligned with image representations. Trained on the NSD dataset with CLIP-based cross-modal alignment, MindSimulator achieves voxel-level and semantic-level encoding performance that surpasses baselines and generalizes to out-of-distribution images such as the CIFAR datasets, enabling large-scale localization of both known and novel concept-selective regions. This data-driven synthetic approach broadens neuroscience inquiry by providing priors for concept localization and paving the way for an expanding brain concept atlas that complements traditional fLoc methods.

Abstract

Concept-selective regions within the human cerebral cortex exhibit significant activation in response to specific visual stimuli associated with particular concepts. Precisely localizing these regions stands as a crucial long-term goal in neuroscience to grasp essential brain functions and mechanisms. Conventional experiment-driven approaches hinge on manually constructed visual stimulus collections and corresponding brain activity recordings, constraining the support and coverage of concept localization. Additionally, these stimuli often consist of concept objects in unnatural contexts and are potentially biased by subjective preferences, thus prompting concerns about the validity and generalizability of the identified regions. To address these limitations, we propose a data-driven exploration approach. By synthesizing extensive brain activity recordings, we statistically localize various concept-selective regions. Our proposed MindSimulator leverages advanced generative technologies to learn the probability distribution of brain activity conditioned on concept-oriented visual stimuli. This enables the creation of simulated brain recordings that reflect real neural response patterns. Using the synthetic recordings, we successfully localize several well-studied concept-selective regions and validate them against empirical findings, achieving promising prediction accuracy. The feasibility opens avenues for exploring novel concept-selective regions and provides prior hypotheses for future neuroscience research.
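
The abstract states that concept-selective regions are localized statistically from synthetic recordings but does not spell out the test here. As a rough illustration only, the sketch below assumes a voxel-wise two-sample Welch t-statistic that contrasts synthetic fMRI for concept images against synthetic fMRI for other images. The function names `welch_t_map` and `localize`, the threshold value, and the toy data are all hypothetical and are not taken from the paper.

```python
import numpy as np


def welch_t_map(concept, control):
    """Voxel-wise Welch t-statistic between two sets of fMRI samples.

    concept: array of shape (n_concept, n_voxels)
    control: array of shape (n_control, n_voxels)
    Returns an array of shape (n_voxels,).
    """
    concept = np.asarray(concept, dtype=float)
    control = np.asarray(control, dtype=float)
    mean_diff = concept.mean(axis=0) - control.mean(axis=0)
    # Squared standard error of each group mean (unbiased variance).
    se2_concept = concept.var(axis=0, ddof=1) / concept.shape[0]
    se2_control = control.var(axis=0, ddof=1) / control.shape[0]
    # Small constant guards against division by zero for constant voxels.
    return mean_diff / np.sqrt(se2_concept + se2_control + 1e-12)


def localize(concept, control, t_threshold=2.0):
    """Boolean mask of voxels whose t-statistic exceeds the threshold.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    return welch_t_map(concept, control) > t_threshold


# Toy data standing in for synthetic fMRI: 50 voxels, the first 5 of which
# respond more strongly to the (hypothetical) concept images.
rng = np.random.default_rng(0)
n_voxels = 50
control = rng.normal(size=(200, n_voxels))
concept = rng.normal(size=(200, n_voxels))
concept[:, :5] += 3.0

mask = localize(concept, control, t_threshold=5.0)
print("selected voxels:", np.flatnonzero(mask))
```

In practice the two inputs would be model-generated fMRI for concept-oriented and non-concept stimuli, and a multiple-comparison correction would normally be applied before interpreting the mask.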

Paper Structure

This paper contains 30 sections, 6 equations, 20 figures, 7 tables.

Figures (20)

  • Figure 1: Overview of the proposed MindSimulator. It comprises an fMRI autoencoder, a diffusion estimator, and an inference sampler. The fMRI autoencoder enables mutual transformation between voxels and fMRI representations. The diffusion estimator generates fMRI from noise conditioned on images. The inference sampler achieves high-precision fMRI synthesis. Please refer to the sections on the fMRI autoencoder through the inference sampler for more details.
  • Figure 2: An analogy illustrating the limitation of voxel-level metrics. Better low-level performance does not necessarily indicate a more accurate synthesis.
  • Figure 3: The proposed semantic-level evaluation pipeline for synthetic fMRI. We use trained visual decoding models to extract the semantics contained in the synthetic fMRI and compare them with the ground truth.
  • Figure 4: Visual comparison between linear regression encoding and our MindSimulator. GT = seen visual stimuli. Linear = reconstruction from linearly encoded fMRI. Ours = reconstruction from our encoding. The fMRI synthesized by our method yields more accurate concepts, colors, backgrounds, and numbers of objects. We show the results for Subj01; more results can be found in the appendix. Zoom in for better viewing.
  • Figure 5: Comparison between CIFAR-10/100 images (Stimuli) and the corresponding reconstructions from MindSimulator's synthetic fMRI (Ours). The original stimuli are upsampled to 224×224.
  • ...and 15 more figures