Multimodal Brain-Computer Interfaces: AI-powered Decoding Methodologies
Siyang Li, Hongbin Wang, Xiaoqing Chen, Dongrui Wu
TL;DR
The paper surveys AI-powered decoding methodologies for multimodal BCIs, addressing how cross-modality mapping, sequential modeling, and multimodal fusion can improve brain data interpretation across visual, speech, and affective domains. It catalogs algorithmic approaches including cross-modality contrastive learning, generative modeling, and Transformer-based fusion, and discusses how these methods enable mappings, translations, and coherent sequencing between brain signals and external modalities. It also analyzes brain data types, acquisition methods, datasets, and practical challenges such as data heterogeneity, big data requirements, and security/privacy concerns, offering a pathway toward brain foundation models. The study highlights potential societal benefits in healthcare, rehabilitation, and brain-computer interfacing while underscoring remaining obstacles to real-world deployment. Overall, it argues that large-scale, aligned multimodal brain data and foundation-model–driven AI will be pivotal for robust, scalable AI-powered BCIs.
Abstract
Brain-computer interfaces (BCIs) enable direct communication between the brain and external devices. This review highlights the core decoding algorithms that enable multimodal BCIs, including a dissection of the elements, a unified view of diversified approaches, and a comprehensive analysis of the present state of the field. We emphasize algorithmic advancements in cross-modality mapping and sequential modeling, in addition to classic multi-modality fusion, illustrating how these novel AI approaches enhance the decoding of brain data. The current literature on BCI applications in visual, speech, and affective decoding is comprehensively explored. Looking forward, we draw attention to the impact of emerging architectures such as multimodal Transformers and discuss challenges such as brain data heterogeneity and common errors. This review also serves as a bridge in this interdisciplinary field between experts with a neuroscience background and experts who study AI, aiming to provide a comprehensive understanding of AI-powered multimodal BCIs.
