EidetiCom: A Cross-modal Brain-Computer Semantic Communication Paradigm for Decoding Visual Perception
Linfeng Zheng, Peilin Chen, Shiqi Wang
TL;DR
EidetiCom introduces a three-layer, mission-oriented semantic codec for brain signals that enables cross-modal decoding of visual perception under narrow bandwidth. By hierarchically compressing EEG-derived semantics into object labels (OCL), captions (ICL), and thumbnails (SCL), and by decoding each layer conditionally on side information from the higher layers, it achieves ultra-low bit rates ($0.017\sim0.192$ bits per sample) while supporting visual classification, captioning, and image generation. The approach performs well on all three downstream tasks, with competitive perceptual image quality and significant compression gains, suggesting practical potential for eidetic memory storage and assistive BCIs under bandwidth and storage constraints. The results underscore the value of integrating semantic compression, cross-modal decoding, and generative models to enable efficient, real-time brain-computer communication.
Abstract
Brain-computer interfaces (BCIs) facilitate direct communication between the human brain and external systems by utilizing brain signals, eliminating the need for conventional communication methods such as speaking, writing, or typing. Nevertheless, the continuous generation of brain signals in BCI frameworks poses challenges for efficient storage and real-time transmission. When the human brain is regarded as a semantic source, the meaningful information associated with cognitive activities is often obscured by the substantial noise present in acquired brain signals, resulting in abundant redundancy. In this paper, we propose a cross-modal brain-computer semantic communication paradigm, named EidetiCom, for decoding visual perception under limited-bandwidth constraints. The framework consists of three hierarchical layers, each responsible for compressing the semantic information of brain signals into representative features. These compact, low-dimensional features are transmitted and converted into semantically meaningful representations at the receiver side, serving three distinct tasks for decoding visual perception in a scalable manner: brain signal-based visual classification, brain-to-caption translation, and brain-to-image generation. Through extensive qualitative and quantitative experiments, we demonstrate that the proposed paradigm facilitates semantic communication at low bit rates ranging from 0.017 to 0.192 bits per sample, achieving high-quality semantic reconstruction and highlighting its potential for efficient storage and real-time communication of brain recordings in BCI applications, such as eidetic memory storage and assistive communication for patients.
