EidetiCom: A Cross-modal Brain-Computer Semantic Communication Paradigm for Decoding Visual Perception

Linfeng Zheng; Peilin Chen; Shiqi Wang

TL;DR

EidetiCom introduces a three-layer, mission-oriented semantic codec for brain signals that enables cross-modal decoding of visual perception under narrow bandwidth. By hierarchically compressing EEG-derived semantics into object labels (OCL), captions (ICL), and thumbnails (SCL), and using conditional decoding with higher-layer side information, it achieves ultra-low bit rates ($0.017\sim0.192$ bits per sample) while supporting visual classification, captioning, and image generation. The approach performs well on all three downstream tasks, with competitive perceptual image quality and significant compression gains, suggesting practical potential for eidetic memory storage and assistive BCIs under bandwidth and storage constraints. The results underscore the value of integrating semantic compression, cross-modal decoding, and generative models for efficient, real-time brain-computer communication.

Abstract

Brain-computer interfaces (BCIs) facilitate direct communication between the human brain and external systems by utilizing brain signals, eliminating the need for conventional communication methods such as speaking, writing, or typing. Nevertheless, the continuous generation of brain signals in BCI frameworks poses challenges for efficient storage and real-time transmission. When the human brain is viewed as a semantic source, the meaningful information associated with cognitive activities is often obscured by substantial noise in the acquired brain signals, resulting in abundant redundancy. In this paper, we propose a cross-modal brain-computer semantic communication paradigm, named EidetiCom, for decoding visual perception under a limited-bandwidth constraint. The framework consists of three hierarchical layers, each responsible for compressing the semantic information of brain signals into representative features. These low-dimensional compact features are transmitted and converted into semantically meaningful representations at the receiver side, serving three distinct tasks for decoding visual perception in a scalable manner: brain signal-based visual classification, brain-to-caption translation, and brain-to-image generation. Through extensive qualitative and quantitative experiments, we demonstrate that the proposed paradigm facilitates semantic communication at low bit rates ranging from 0.017 to 0.192 bits per sample, achieving high-quality semantic reconstruction and highlighting its potential for efficient storage and real-time communication of brain recordings in BCI applications, such as eidetic memory storage and assistive communication for patients.
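
To give a feel for the reported range of 0.017 to 0.192 bits per sample, the following minimal sketch converts a bit rate into a total bit budget for one recording. It assumes that a "sample" is one scalar value per channel per time point, which the abstract does not state explicitly, and the recording shape used here (100 channels, 500 time points) is purely hypothetical, as are the function names.

```python
def bits_per_sample(total_bits: float, n_channels: int, n_timepoints: int) -> float:
    """Bit rate of a compressed recording, in bits per raw signal sample.

    Assumption: one sample = one value per channel per time point.
    """
    n_samples = n_channels * n_timepoints
    if n_samples <= 0:
        raise ValueError("recording must contain at least one sample")
    return total_bits / n_samples


def bit_budget(rate_bps: float, n_channels: int, n_timepoints: int) -> float:
    """Total number of bits available for one recording at a given bit rate."""
    return rate_bps * n_channels * n_timepoints


# Hypothetical recording shape, chosen only for illustration.
N_CHANNELS, N_TIMEPOINTS = 100, 500

low_budget = bit_budget(0.017, N_CHANNELS, N_TIMEPOINTS)   # lowest reported rate
high_budget = bit_budget(0.192, N_CHANNELS, N_TIMEPOINTS)  # highest reported rate
print(f"budget at 0.017 bits/sample: {low_budget:.0f} bits")
print(f"budget at 0.192 bits/sample: {high_budget:.0f} bits")
```

Under these assumptions, the whole recording would have to be described in roughly a hundred bytes at the lowest rate and about a kilobyte at the highest, which is why the codec transmits semantic features rather than the waveform itself.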

Paper Structure (27 sections, 12 equations, 13 figures, 9 tables)

Introduction
Related Works
Neural Decoding of Brain Signal for BCI
End-to-end Signal Compression
Cross-modal Semantic Communication
Method
The Overall Framework
Object-level Category Layer
Image-level Caption Layer
Stimuli-level Cognition Layer
Experiments
Dataset
Implementation Details
Evaluation Measures
EidetiCom for Brain Signal-based Visual Classification
...and 12 more sections

Figures (13)

Figure 1: Overall framework of EidetiCom, a cross-modal brain-computer semantic communication paradigm for decoding visual perception. The EidetiCom framework encapsulates the entire information processing chain, incorporating the information source, semantic encoder, transmission link, semantic decoder, and information destination. In this semantic communication paradigm, the human brain serves as the source, generating visually-evoked brain signals associated with cognitive activities. These signals undergo encoding, transmission, and decoding and are ultimately absorbed by the destination, which decodes visual perception for specific missions. The EidetiCom framework significantly reduces the storage and transmission costs of brain signals for such missions, for example eidetic memory or dream storage, which are sensitive to storage needs, and assistive communication for patients, which is sensitive to bandwidth requirements.
Figure 2: The three-layered codec architecture of EidetiCom. EidetiCom consists of three hierarchical layers, each responsible for compressing visually-evoked brain signals into compact semantic features, facilitating effective transmission. These features can be utilized in a scalable manner to decode the visual category, image caption, and generated image associated with the visual stimuli.
Figure 3: The architecture of the Object-level Category Layer (OCL). During training, the layer is optimized to achieve the best trade-off between the bit rate and the distortion of semantic information for the label. During inference, the compressed features are used to identify the category of visually-evoked brain signals based on cosine similarities.
Figure 4: The architecture of the Image-level Caption Layer (ICL). During training, the layer is optimized to achieve the best trade-off between the bit rate and the distortion of semantic information for the caption. During inference, the compressed features are translated to image captions for the visual stimuli using the pretrained CLIP text decoder.
Figure 5: The conditional decoder of ICL. The latent feature $\boldsymbol{\hat{y}}_2$ is decoded into a more detailed text-level semantic feature $\boldsymbol{\hat{z}}_2$ with the modulation of category context $\boldsymbol{\hat{z}}_1$.
...and 8 more figures
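
The cosine-similarity inference step mentioned in the Figure 3 caption can be illustrated with a minimal sketch. It assumes that each category is represented by a reference embedding and that the decoded feature is assigned to the category with the highest cosine similarity; the function name, the toy three-dimensional embeddings, and the use of NumPy are illustrative choices, not details taken from the paper.

```python
import numpy as np


def classify_by_cosine(feature: np.ndarray, class_embeddings: np.ndarray):
    """Pick the class whose embedding is most similar to the decoded feature.

    feature:          shape (d,), decoded semantic feature (hypothetical)
    class_embeddings: shape (n_classes, d), one reference embedding per class
    Returns (predicted class index, cosine similarity to every class).
    """
    # Normalize to unit length so the dot product equals cosine similarity.
    f = feature / np.linalg.norm(feature)
    c = class_embeddings / np.linalg.norm(class_embeddings, axis=1, keepdims=True)
    sims = c @ f
    return int(np.argmax(sims)), sims


# Toy data: three classes with orthogonal reference embeddings.
class_embeddings = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])
decoded = np.array([0.2, 0.9, 0.1])  # lies closest to the second class

pred, sims = classify_by_cosine(decoded, class_embeddings)
print("predicted class index:", pred)
print("similarities:", np.round(sims, 3))
```

Because both sides are normalized first, only the direction of the decoded feature matters, not its magnitude, which is a common reason to prefer cosine similarity when matching compressed features against reference embeddings.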
