The Wisdom of a Crowd of Brains: A Universal Brain Encoder
Roman Beliy, Navve Wasserman, Amit Zalcher, Michal Irani
TL;DR
Image-to-fMRI encoders are usually trained per subject and per dataset, which limits the training data available to each model. This paper addresses that bottleneck with a Universal Brain-Encoder built around a voxel-centric architecture. The model is trained jointly on diverse subjects and datasets. It uses a shared image-feature extractor, learns a 256-dimensional embedding per voxel, and links each voxel's functionality to multi-scale image features through cross-attention. This crowd-based approach improves encoding accuracy across datasets, enables efficient transfer learning to new subjects from minimal data, and reveals functionally meaningful voxel clusters that map to shared brain functions without requiring exact anatomical alignment. The result is a scalable framework for exploring brain functionality at voxel granularity, which broadens the applicability of brain-encoding models in neuroscience and, potentially, in clinical use.
Abstract
Image-to-fMRI encoding is important for both neuroscience research and practical applications. However, such "Brain-Encoders" have typically been trained per subject and per fMRI dataset, and are thus restricted to very limited training data. In this paper we propose a Universal Brain-Encoder, which can be trained jointly on data from many different subjects, datasets, and machines. What makes this possible is our new voxel-centric Encoder architecture, which learns a unique "voxel-embedding" per brain-voxel. Our Encoder is trained to predict the response of each brain-voxel to every image by directly computing the cross-attention between the brain-voxel embedding and multi-level deep image features. This voxel-centric architecture allows the functional role of each brain-voxel to emerge naturally from the voxel-image cross-attention. We show the power of this approach to (i) combine data from multiple different subjects (a "Crowd of Brains") to improve each individual brain-encoding, (ii) enable quick and effective Transfer-Learning across subjects, datasets, and machines (e.g., 3-Tesla, 7-Tesla) with few training examples, and (iii) use the learned voxel-embeddings as a powerful tool to explore brain functionality (e.g., what is encoded where in the brain).
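To make the voxel-image cross-attention idea concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes each voxel embedding acts as an attention query over image-feature tokens pooled from several feature levels, and that a shared linear readout maps the attended features to one scalar response per voxel. The function name predict_voxel_responses, the projection matrices W_k and W_v, the readout w_out, and all shapes other than the 256-dimensional embedding are hypothetical choices made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def predict_voxel_responses(voxel_emb, feature_levels, W_k, W_v, w_out):
    """Hypothetical voxel-centric cross-attention readout.

    voxel_emb:      (V, D) one learned embedding per voxel (the queries)
    feature_levels: list of (T_l, C_l) arrays, image tokens per feature level
    W_k, W_v:       lists of (C_l, D) projections, shared by all voxels
    w_out:          (D,) shared readout vector
    Returns (V,) predicted responses and (V, T) attention weights.
    """
    # Project every feature level into the embedding space and stack the tokens.
    keys = np.concatenate([f @ wk for f, wk in zip(feature_levels, W_k)], axis=0)
    values = np.concatenate([f @ wv for f, wv in zip(feature_levels, W_v)], axis=0)

    # Scaled dot-product attention: each voxel attends over all image tokens.
    scores = voxel_emb @ keys.T / np.sqrt(voxel_emb.shape[1])
    attn = softmax(scores, axis=1)

    # Attended feature vector per voxel, then a shared scalar readout.
    attended = attn @ values
    return attended @ w_out, attn


# Toy example with random numbers (assumed sizes, for illustration only).
rng = np.random.default_rng(0)
D = 256                      # embedding size stated in the TL;DR
level_shapes = [(49, 64), (196, 32)]   # (tokens, channels) per assumed level
feature_levels = [rng.standard_normal(s) for s in level_shapes]
W_k = [rng.standard_normal((c, D)) / np.sqrt(c) for _, c in level_shapes]
W_v = [rng.standard_normal((c, D)) / np.sqrt(c) for _, c in level_shapes]
w_out = rng.standard_normal(D) / np.sqrt(D)
voxel_emb = rng.standard_normal((5, D))   # 5 voxels, possibly from different subjects

responses, attn = predict_voxel_responses(voxel_emb, feature_levels, W_k, W_v, w_out)
```

In this sketch the only voxel-specific parameters are the rows of voxel_emb; everything else is shared. Under that assumption, voxels from different subjects or scanners can be stacked into one array and trained together, and adapting to a new subject amounts to learning new embedding rows.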
