Latency-Aware Generative Semantic Communications with Pre-Trained Diffusion Models
Li Qiao, Mahdi Boloursaz Mashhadi, Zhen Gao, Chuan Heng Foh, Pei Xiao, Mehdi Bennis
TL;DR
The paper addresses ultra-low-rate semantic communications by leveraging pre-trained generative foundation models to synthesize signals at the receiver from compressed semantic streams. It proposes a latency-aware GenSemCom framework that combines multi-modal semantic decomposition, a re-transmission-based scheme for reliable delivery of the prompt, adaptive modulation and coding for the conditioning signals, and latency-aware power allocation under semantic quality constraints. At the receiver, a pre-trained diffusion model generates high-fidelity outputs guided by the prompt and conditioning signals, enabling universal applicability without shared knowledge bases. Simulations on image data show ultra-low-rate, low-latency, and channel-adaptive performance, and quantify the trade-offs between latency, power, and semantic quality metrics such as CLIP and MS-SSIM.
Abstract
Generative foundation AI models have recently shown great success in synthesizing natural signals with high perceptual quality using only textual prompts and conditioning signals to guide the generation process. This enables semantic communications at extremely low data rates in future wireless networks. In this paper, we develop a latency-aware semantic communications framework with pre-trained generative models. The transmitter performs multi-modal semantic decomposition on the input signal and transmits each semantic stream with the appropriate coding and communication schemes based on the intent. For the prompt, we adopt a re-transmission-based scheme to ensure reliable transmission, and for the other semantic modalities we use an adaptive modulation/coding scheme to achieve robustness to the changing wireless channel. Furthermore, we design a semantic and latency-aware scheme to allocate transmission power to different semantic modalities based on their importance, subject to semantic quality constraints. At the receiver, a pre-trained generative model synthesizes a high-fidelity signal using the received multi-stream semantics. Simulation results demonstrate ultra-low-rate, low-latency, and channel-adaptive semantic communications.
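
As a rough illustration of the two transmission modes described above, the following Python sketch picks a modulation order from the channel SNR for a conditioning stream and estimates the expected latency of a repeat-until-success scheme for the prompt. It is a minimal sketch under stated assumptions, not the scheme used in the paper: the table `MCS_TABLE`, its SNR thresholds, and the helpers `select_mcs` and `expected_arq_latency` are hypothetical, and the latency model ignores channel coding rate and feedback delay.

```python
# Hypothetical modulation table: (name, bits per symbol, minimum SNR in dB).
# The thresholds are illustrative placeholders, not values from the paper.
MCS_TABLE = [
    ("BPSK", 1, 0.0),
    ("QPSK", 2, 6.0),
    ("16QAM", 4, 12.0),
    ("64QAM", 6, 18.0),
]


def select_mcs(snr_db):
    """Pick the highest-rate entry whose SNR threshold is met.

    Falls back to the most robust entry when the SNR is below every threshold.
    """
    chosen = MCS_TABLE[0]
    for entry in MCS_TABLE:  # table is sorted by increasing threshold
        if snr_db >= entry[2]:
            chosen = entry
    return chosen


def expected_arq_latency(num_bits, bits_per_symbol, symbol_rate_hz, packet_error_rate):
    """Expected latency (seconds) when a packet is resent until it succeeds.

    With independent attempts, the number of transmissions is geometric with
    mean 1 / (1 - PER), so the expected latency is airtime / (1 - PER).
    Assumption: feedback and processing delays are neglected.
    """
    if not 0.0 <= packet_error_rate < 1.0:
        raise ValueError("packet_error_rate must be in [0, 1)")
    airtime = num_bits / (bits_per_symbol * symbol_rate_hz)
    return airtime / (1.0 - packet_error_rate)


if __name__ == "__main__":
    name, bits_per_symbol, _ = select_mcs(13.0)
    print("conditioning stream modulation:", name)
    # Assumed example: a 1000-bit prompt sent with QPSK at 1 Msymbol/s, PER = 0.5.
    print("expected prompt latency (s):", expected_arq_latency(1000, 2, 1e6, 0.5))
```

The split mirrors the intent-based treatment in the abstract: the prompt is short but must arrive intact, so it tolerates re-transmissions, while the larger conditioning streams adapt their rate to the channel instead.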
