Mind-to-Image: Projecting Visual Mental Imagination of the Brain from fMRI

Hugo Caselles-Dupré, Charles Mellerio, Paul Hérent, Alizée Lopez-Persem, Benoit Béranger, Mathieu Soularue, Pierre Fautrel, Gauthier Vernier, Matthieu Cord

TL;DR

This paper introduces Mind-to-Image, a pipeline to reconstruct visual imagery from fMRI data through two imagination modalities: weak imagination (memory-based recall) and strong imagination (pure imagination). A bespoke Surrealism dataset and a 6-hour fMRI collection underpin a modified MindEye-based model adapted to higher-dimensional brain signals, enabling reconstruction via a diffusion-based generator guided by CLIP embeddings. The study demonstrates category-level reconstruction capabilities for imagined portraits and landscapes, and shows transfer learning from weak to strong imagination with 88% category accuracy, while acknowledging limitations in content fidelity and dataset scale. The work highlights ethical considerations for mental privacy and outlines future work, including richer datasets, improved evaluation methods, and potential EEG-based alternatives.

Abstract

The reconstruction of images observed by subjects from fMRI data collected during visual stimulation has made strong progress in the past decade, thanks to the availability of extensive fMRI datasets and advancements in generative models for image generation. However, the application of visual reconstruction has remained limited. Reconstructing visual imagination presents a greater challenge, with potentially revolutionary applications ranging from aiding individuals with disabilities to verifying witness accounts in court. The primary hurdles in this field are the absence of data collection protocols for visual imagery and the lack of datasets on the subject. Traditionally, fMRI-to-image relies on data collected from subjects exposed to visual stimuli, which poses issues for generating visual imagery because brain activity differs between visual stimulation and visual imagery. For the first time, we have compiled a substantial dataset (around 6 hours of scans) on visual imagery along with a proposed data collection protocol. We then train a modified version of an fMRI-to-image model and demonstrate the feasibility of reconstructing images from two modes of imagination: from memory and from pure imagination. The resulting pipeline, which we call Mind-to-Image, marks a step towards creating a technology that allows direct reconstruction of visual imagery.

Paper Structure (23 sections, 5 figures, 1 table)

Figures (5)

  • Figure 1: Overview of our Mind-to-Image pipeline, which reconstructs images from mental imagery. Left: overview of MindEye, the fMRI-to-Image model used in our Mind-to-Image approach. The stimulus (an image seen or, in our case, imagined) and the associated fMRI data are fed to a high-level pipeline (composed of an MLP and a Diffusion Prior) and a low-level pipeline to create aligned CLIP embeddings of the fMRI data. Middle: fMRI data collection protocols. In the classic fMRI-to-Image approach, a subject sees images in an MRI scanner, and BOLD (Blood-Oxygen Level Dependent) data are recorded and then used to train an fMRI-to-Image model that converts brain data to match the seen images. In our approach, we devise two mental imagery protocols: weak and strong imagination. For weak imagination, we let the subject imagine previously seen images from a dataset of surrealist images (face portraits and nature landscapes) which we create. The brain data associated with the recollection of these images are gathered. For strong imagination, we collect brain data while the subject imagines entirely new images based on instructions. Right: our Mind-to-Image pipeline. We use the weak imagination fMRI data to train an fMRI-to-Image model to reconstruct recollected images from visual imagination brain data. At inference time, we feed strong imagination fMRI data to this trained model to generate the imagined images, relying on transfer learning.
  • Figure 2: Functional MRI results for the three conditions: simple visual perception (A, in red), weak imagination (B, in pink) and strong imagination (C, in blue), shown in a lower view (top row) and a right lateral view (bottom row). Activation maps were generated by comparing the “test” and “rest” conditions, using the same statistical threshold for each test (T-score = 5).
  • Figure 3: Results for weak imagination on the validation dataset. Top row: images from the validation set of our surrealism image dataset, which were recollected from memory by the subject in the fMRI scanner. Bottom row: reconstructed images from the fMRI-to-Image model based on the brain data associated with the recollection of the images in the top row.
  • Figure 4: Results for strong imagination. The subject is given an instruction on the fMRI screen, such as "Imagine a portrait representing optimism". The subject then purely imagines such an image and gives an oral description of it. Finally, the associated brain data are fed to the fMRI-to-Image model to produce a reconstruction, which is evaluated against the oral description.
  • Figure 5: Uncurated results for weak imagination on the validation dataset. In each example, we present an image from the validation set of our surrealism image dataset, which is recollected from memory by the subject in the fMRI scanner, and the reconstructed image from the fMRI-to-Image model based on the brain data associated with the recollection of the original image.
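
The weak-to-strong transfer described in the Figure 1 caption (train on weak imagination data, run inference on strong imagination data, then judge the result at the category level) can be illustrated with a toy sketch. This is not the authors' method: the MindEye MLP, diffusion prior and image generator are replaced here by a much simpler nearest-prototype decoder with cosine similarity, the "scans" are synthetic random vectors, and every name (`synthetic_scans`, `fit_prototypes`, `category_accuracy`, the 32-dimensional feature size) is a hypothetical choice made for illustration only.

```python
import math
import random

# Assumption: the two image categories named in the source (portraits, landscapes).
CATEGORIES = ("portrait", "landscape")


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def fit_prototypes(samples, labels):
    """Average the training vectors of each category into one prototype."""
    grouped = {}
    for vector, label in zip(samples, labels):
        grouped.setdefault(label, []).append(vector)
    return {
        label: [sum(column) / len(rows) for column in zip(*rows)]
        for label, rows in grouped.items()
    }


def predict_category(prototypes, vector):
    """Return the category whose prototype is most similar to the vector."""
    return max(prototypes, key=lambda label: cosine(prototypes[label], vector))


def category_accuracy(prototypes, samples, labels):
    """Fraction of samples assigned to their true category."""
    hits = sum(predict_category(prototypes, v) == y for v, y in zip(samples, labels))
    return hits / len(labels)


def synthetic_scans(rng, n_per_category, noise, dim=32):
    """Hypothetical stand-in for fMRI features: a fixed per-category
    signature pattern plus Gaussian noise. Not real brain data."""
    samples, labels = [], []
    for k, label in enumerate(CATEGORIES):
        signature = [1.0 if (i + k) % 2 == 0 else -1.0 for i in range(dim)]
        for _ in range(n_per_category):
            samples.append([s + rng.gauss(0.0, noise) for s in signature])
            labels.append(label)
    return samples, labels


if __name__ == "__main__":
    rng = random.Random(0)
    # "Weak imagination" trials play the role of training data.
    weak_x, weak_y = synthetic_scans(rng, 20, noise=0.5)
    # "Strong imagination" trials are only seen at inference time.
    strong_x, strong_y = synthetic_scans(rng, 20, noise=0.5)
    prototypes = fit_prototypes(weak_x, weak_y)
    print("toy category accuracy:", category_accuracy(prototypes, strong_x, strong_y))
```

In this sketch the decoder never sees the "strong" trials during fitting, which mirrors the train/inference split in the caption. The accuracy it prints says nothing about the 88% figure reported for the real pipeline, since the data here are synthetic and the two sets share the same signatures by construction.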