Mind-to-Image: Projecting Visual Mental Imagination of the Brain from fMRI
Hugo Caselles-Dupré, Charles Mellerio, Paul Hérent, Alizée Lopez-Persem, Benoit Béranger, Mathieu Soularue, Pierre Fautrel, Gauthier Vernier, Matthieu Cord
TL;DR
This paper introduces Mind-to-Image, a pipeline that reconstructs visual imagery from fMRI data under two imagination modalities: weak imagination (memory-based recall) and strong imagination (pure imagination). A bespoke Surrealism dataset and a 6-hour fMRI collection underpin a modified MindEye-based model adapted to higher-dimensional brain signals, with reconstruction performed by a diffusion-based generator guided by CLIP embeddings. The study demonstrates category-level reconstruction of imagined portraits and landscapes and shows transfer learning from weak to strong imagination with 88% category accuracy, while acknowledging limitations in content fidelity and dataset scale. The work also highlights ethical considerations around mental privacy and outlines future work, including richer datasets, improved evaluation methods, and potential EEG-based alternatives.
Abstract
The reconstruction of images observed by subjects from fMRI data collected during visual stimulation has made strong progress in the past decade, thanks to the availability of extensive fMRI datasets and advances in generative models for image generation. However, the application of visual reconstruction has remained limited. Reconstructing visual imagination presents a greater challenge, with potentially revolutionary applications ranging from aiding individuals with disabilities to verifying witness accounts in court. The primary hurdles in this field are the absence of data collection protocols for visual imagery and the lack of datasets on the subject. Traditionally, fMRI-to-image models rely on data collected from subjects exposed to visual stimuli, which is problematic for reconstructing visual imagery because brain activity differs between visual stimulation and visual imagery. For the first time, we have compiled a substantial dataset (around 6 hours of scans) on visual imagery, along with a proposed data collection protocol. We then train a modified version of an fMRI-to-image model and demonstrate the feasibility of reconstructing images from two modes of imagination: from memory and from pure imagination. The resulting pipeline, which we call Mind-to-Image, marks a step towards a technology that allows direct reconstruction of visual imagery.
