SIMAC: A Semantic-Driven Integrated Multimodal Sensing And Communication Framework

Yubo Peng; Luping Xiang; Kun Yang; Feibo Jiang; Kezhi Wang; Dapeng Oliver Wu

TL;DR

This work tackles the limitations of single-modality sensing and decoupled sensing-communication by proposing SIMAC, a semantic-driven framework for integrated multimodal sensing and communication. It combines a Multimodal Semantic Fusion network, an LLM-based semantic encoder, and a task-oriented semantic decoder to jointly sense and transmit meaning, rather than raw data, over wireless channels. A multi-task learning objective enables diversified sensing services (distance, angle, velocity, and image reconstruction) while maintaining low communication overhead. Experimental results on VIRAT-derived data demonstrate improved sensing accuracy and robust multimodal reconstruction across varying SNRs, highlighting SIMAC's potential for real-time, bandwidth-constrained sensing applications.

Abstract

Traditional single-modality sensing is limited in accuracy and capability, and implementing it separately from the communication system increases latency in bandwidth-constrained environments. In addition, sensing systems built for a single task cannot meet users' diverse demands. To overcome these challenges, we propose a semantic-driven integrated multimodal sensing and communication (SIMAC) framework. The framework uses a joint source-channel coding architecture to perform sensing decoding and the transmission of sensing results simultaneously. First, SIMAC introduces a multimodal semantic fusion (MSF) network, which uses two extractors to obtain semantic information from radar signals and images, respectively, and then applies cross-attention to fuse these unimodal features into multimodal semantic representations. Second, we present a large language model (LLM)-based semantic encoder (LSE), in which the relevant communication parameters and the multimodal semantics are mapped into a unified latent space and fed to the LLM, enabling channel-adaptive semantic encoding. Third, we propose a task-oriented sensing semantic decoder (SSD) whose decoding heads are designed for the specific needs of each task. A multi-task learning strategy is used to train the SIMAC framework so that it can provide diverse sensing services. Finally, simulation results demonstrate that the proposed framework delivers diverse sensing services with higher accuracy.
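To make the fusion step concrete, the following is a minimal sketch of how cross-attention can combine radar and image features into one multimodal vector. It is an illustration under stated assumptions, not the MSF network from the paper: the function names (`cross_attention`, `fuse`), the feature sizes, the random projection weights, and the mean-pool-and-concatenate step are all hypothetical choices made for this example.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, context, w_q, w_k, w_v):
    """Scaled dot-product attention: query tokens attend to context tokens."""
    q = matmul(queries, w_q)
    k = matmul(context, w_k)
    v = matmul(context, w_v)
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v), weights


def mean_pool(tokens):
    """Average a sequence of token vectors into one vector."""
    return [sum(col) / len(tokens) for col in zip(*tokens)]


def fuse(radar_feats, image_feats, params):
    """Assumed fusion: attend in both directions, pool, then concatenate."""
    radar_ctx, _ = cross_attention(radar_feats, image_feats, *params["radar_to_image"])
    image_ctx, _ = cross_attention(image_feats, radar_feats, *params["image_to_radar"])
    return mean_pool(radar_ctx) + mean_pool(image_ctx)


def random_matrix(rng, rows, cols, scale=1.0):
    """Gaussian matrix standing in for extracted features or learned weights."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
d_model = 8                                    # hypothetical feature width
radar_feats = random_matrix(rng, 4, d_model)   # 4 radar tokens (placeholder data)
image_feats = random_matrix(rng, 6, d_model)   # 6 image tokens (placeholder data)
params = {
    name: tuple(random_matrix(rng, d_model, d_model, d_model ** -0.5) for _ in range(3))
    for name in ("radar_to_image", "image_to_radar")
}

fused = fuse(radar_feats, image_feats, params)
print(len(fused))  # 16: two pooled d_model-sized vectors concatenated
```

In this sketch each modality queries the other, so the radar tokens are re-expressed in terms of image content and vice versa. A trained model would learn the projection matrices and would typically keep the full token sequence rather than pooling it.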
