NeuroBind: Towards Unified Multimodal Representations for Neural Signals
Fengyu Yang, Chao Feng, Daniel Wang, Tianye Wang, Ziyao Zeng, Zhiyang Xu, Hyoungseob Park, Pengliang Ji, Hanbin Zhao, Yuanning Li, Alex Wong
TL;DR
NeuroBind addresses the fragmentation of neural-signal analysis by unifying EEG, fMRI, calcium imaging, and spiking data in a shared embedding space aligned with pre-trained vision-language models. It keeps the image encoder frozen and trains neural encoders with a symmetric InfoNCE objective, so zero-shot and cross-modal tasks require no retraining of the visual component. The method supports cross-modal retrieval, zero-shot classification, zero-shot image reconstruction, and integration with Neuro-LLM, and is demonstrated on four diverse datasets with performance gains over baselines. This unified representation has the potential to advance neuroscience research, facilitate neuroprosthetics, and enhance AI systems by leveraging high-resource modalities. Overall, NeuroBind provides a scalable, modality-agnostic framework for interpreting complex brain signals through vision-language priors.
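To make the training objective concrete, the following is a minimal sketch of a symmetric InfoNCE loss over a batch of paired neural and image embeddings. It is an illustration under stated assumptions, not the authors' implementation: the function names, the temperature value of 0.07, and the equal 0.5 weighting of the two directions are all hypothetical choices, and the image embeddings are assumed to come from a frozen encoder and are treated as fixed inputs.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def log_softmax(row):
    """Numerically stable log-softmax of a list of logits."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]


def symmetric_info_nce(neural_emb, image_emb, temperature=0.07):
    """Symmetric InfoNCE over a batch where row k of each input is a matched pair.

    neural_emb: outputs of a trainable neural-signal encoder (assumed).
    image_emb:  outputs of a frozen image encoder (assumed).
    temperature: hypothetical value, not taken from the paper.
    """
    n = len(neural_emb)
    z_n = [l2_normalize(v) for v in neural_emb]
    z_i = [l2_normalize(v) for v in image_emb]

    # logits[r][c] = cosine similarity of neural sample r and image c, scaled.
    logits = [
        [sum(a * b for a, b in zip(zn, zi)) / temperature for zi in z_i]
        for zn in z_n
    ]

    # Neural -> image direction: each row should pick out its own column.
    loss_n2i = -sum(log_softmax(logits[k])[k] for k in range(n)) / n

    # Image -> neural direction: same computation on the transposed matrix.
    columns = [[logits[r][c] for r in range(n)] for c in range(n)]
    loss_i2n = -sum(log_softmax(columns[k])[k] for k in range(n)) / n

    # Averaging the two directions is an assumed convention.
    return 0.5 * (loss_n2i + loss_i2n)


if __name__ == "__main__":
    aligned = symmetric_info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
    shuffled = symmetric_info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
    print(aligned < shuffled)
```

In this toy batch the correctly paired embeddings give a loss near zero, while swapping the image rows gives a much larger loss, which is the behavior the objective relies on to pull matched neural and image embeddings together.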
Abstract
Understanding neural activity and information representation is crucial for advancing knowledge of brain function and cognition. Neural activity, measured through techniques such as electrophysiology and neuroimaging, reflects various aspects of information processing. Recent advances in deep neural networks offer new approaches to analyzing these signals using pre-trained models. However, challenges arise from discrepancies between neural signal modalities and from the limited scale of high-quality neural data. To address these challenges, we present NeuroBind, a general representation that unifies multiple brain signal types, including EEG, fMRI, calcium imaging, and spiking data. To achieve this, we align neural signals from image-paired neural datasets to pre-trained vision-language embeddings. NeuroBind is the first model that studies different neural modalities jointly and is able to leverage high-resource modality models for various neuroscience tasks. We also show that combining information from different neural signal modalities, all mapped to the same space, enhances downstream performance, demonstrating the complementary strengths of these modalities. This approach holds significant potential for advancing neuroscience research, improving AI systems, and developing neuroprosthetics and brain-computer interfaces.
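Once several neural modalities are mapped into the same space as vision-language embeddings, downstream use can be as simple as a nearest-neighbor lookup. The sketch below illustrates zero-shot classification by cosine similarity against class embeddings. The abstract does not say how information from different modalities is combined, so the `fuse` helper (averaging unit-normalized embeddings) is purely an assumed placeholder, and all names and toy vectors here are hypothetical.

```python
import math


def normalize(vec):
    """Return the unit-length version of a vector."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cosine(u, v):
    """Cosine similarity between two vectors."""
    return sum(a * b for a, b in zip(normalize(u), normalize(v)))


def zero_shot_classify(neural_embedding, class_embeddings):
    """Pick the label whose embedding is most similar to the neural embedding.

    class_embeddings: dict mapping label -> embedding in the shared space,
    e.g. produced by a pre-trained text or image encoder (assumed).
    """
    return max(
        class_embeddings,
        key=lambda label: cosine(neural_embedding, class_embeddings[label]),
    )


def fuse(embeddings):
    """Assumed fusion rule: mean of unit-normalized embeddings, renormalized.

    This is a placeholder for illustration, not the method from the paper.
    """
    unit = [normalize(e) for e in embeddings]
    dim = len(unit[0])
    mean = [sum(e[d] for e in unit) / len(unit) for d in range(dim)]
    return normalize(mean)


if __name__ == "__main__":
    classes = {"cat": [1.0, 0.0], "dog": [0.0, 1.0]}  # toy class embeddings
    eeg_embedding = [0.9, 0.2]   # hypothetical EEG encoder output
    fmri_embedding = [0.7, 0.4]  # hypothetical fMRI encoder output
    print(zero_shot_classify(eeg_embedding, classes))
    print(zero_shot_classify(fuse([eeg_embedding, fmri_embedding]), classes))
```

Because every modality lands in one space, the same `zero_shot_classify` call works whether its input comes from a single encoder or from a fused embedding.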
