Communicate Less, Synthesize the Rest: Latency-aware Intent-based Generative Semantic Multicasting with Diffusion Models

Xinkai Liu, Mahdi Boloursaz Mashhadi, Li Qiao, Yi Ma, Rahim Tafazolli, Mehdi Bennis

TL;DR

The paper tackles latency-constrained, multi-user semantic communications where receivers have heterogeneous intents. It proposes a latency-aware, intent-based generative semantic multicasting framework that decomposes the source into semantic classes, transmits to each user only its intended classes, and multicasts a shared semantic map so that users can synthesize the non-intended content locally with diffusion models, all within a source-channel separation architecture. Key contributions include (i) multi-class semantic decomposition with DDRNet-based segmentation, (ii) mixed reconstruction/synthesis using diffusion models with semantic guidance, (iii) per-class adaptive resource allocation and multi-stream synchronization, and (iv) an SQP-based optimization framework that minimizes total latency under distortion/perception constraints, supported by offline rate-distortion/perception curves. Simulation results on Cityscapes and COCO-Stuff show reduced per-user latency and competitive perceptual quality compared with non-generative and intent-unaware baselines, with diffusion-based synthesis offering superior visual realism. The approach is intended to be scalable and adaptable for future wireless platforms supporting immersive, multi-user, AI-driven multimedia services.

Abstract

Generative diffusion models (GDMs) have recently shown great success in synthesizing multimedia signals with high perceptual quality, enabling highly efficient semantic communications in future wireless networks. In this paper, we develop an intent-aware generative semantic multicasting framework utilizing pre-trained diffusion models. In the proposed framework, the transmitter decomposes the source signal into multiple semantic classes based on the multi-user intent, i.e., each user is assumed to be interested in the details of only a subset of the semantic classes. To better utilize the wireless resources, the transmitter sends to each user only its intended classes, and multicasts a highly compressed semantic map to all users over shared wireless resources, which allows them to locally synthesize the other (non-intended) classes utilizing pre-trained diffusion models. The signal retrieved at each user is thereby partially reconstructed and partially synthesized utilizing the received semantic map. We design a communication/computation-aware scheme for per-class adaptation of the communication parameters, such as the transmission power and compression rate, to minimize the total latency of retrieving signals at multiple receivers, tailored to the prevailing channel conditions as well as the users' reconstruction/synthesis distortion/perception requirements. The simulation results demonstrate significantly reduced per-user latency compared with non-generative and intent-unaware multicasting benchmarks while maintaining high perceptual quality of the signals retrieved at the users.
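
The intent-based split described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the paper's method: the function names, the toy 2x3 semantic map, the use of the Shannon capacity as a stand-in for the achievable rate, and the simplified latency model (map bits plus intended-class bits over a single link, followed by a fixed synthesis time). The paper's actual scheme multicasts the map over shared resources and optimizes power and compression rate per class.

```python
import math


def split_by_intent(semantic_map, intents):
    """Count, per user, pixels in intended classes vs. pixels left to synthesis."""
    total = sum(len(row) for row in semantic_map)
    split = {}
    for user, wanted in intents.items():
        sent = sum(1 for row in semantic_map for label in row if label in wanted)
        split[user] = {"transmitted": sent, "synthesized": total - sent}
    return split


def shannon_rate(bandwidth_hz, snr_linear):
    """AWGN capacity in bits/s (assumed stand-in for the real coded rate)."""
    return bandwidth_hz * math.log2(1.0 + snr_linear)


def user_latency(pixels_sent, bits_per_pixel, map_bits, rate_bps, synth_seconds):
    """Toy latency: deliver map + intended pixels, then synthesize locally."""
    return (map_bits + pixels_sent * bits_per_pixel) / rate_bps + synth_seconds


# Hypothetical 2x3 semantic map with class labels 0, 1, 2.
semantic_map = [[0, 0, 1], [2, 1, 1]]
# Hypothetical intents: each user cares about a subset of the classes.
intents = {"user_a": {0}, "user_b": {1, 2}}

split = split_by_intent(semantic_map, intents)
rate = shannon_rate(1e6, 3.0)  # 1 MHz bandwidth, linear SNR of 3
latency_a = user_latency(split["user_a"]["transmitted"], 8, 4, rate, 0.5)
print(split, rate, latency_a)
```

In this toy setting, a user with a narrow intent receives fewer pixels over the air and relies more on local synthesis, which is the trade-off between communication and computation latency that the framework optimizes.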

Paper Structure

This paper contains 23 sections, 11 equations, 19 figures, 4 tables, 1 algorithm.

Figures (19)

  • Figure 1: Proposed Framework for Diffusion-based Generative Semantic Multicasting with Intent-aware Semantic Decomposition.
  • Figure 2: Intent-aware Semantic Decomposition with DDRNets for generative multicasting.
  • Figure 3: Separate source-channel coding workflow for generative semantic multicasting.
  • Figure 4: Reconstruction/Synthesis Distortion/Perception curves.
  • Figure 5: Per-user latency comparison between the proposed generative semantic multicasting framework and the benchmarks, for (a) $K=5$, (b) $K=10$, (c) $K=15$, and (d) $K=20$.
  • ...and 14 more figures