Generative AI Meets 6G and Beyond: Diffusion Models for Semantic Communications

Hai-Long Qin, Jincheng Dai, Guo Lu, Shuo Shao, Sixian Wang, Tongda Xu, Wenjun Zhang, Ping Zhang, Khaled B. Letaief

TL;DR

This work addresses the shift from bit-perfect to meaning-preserving wireless communication by advocating diffusion models as foundational priors for semantic reconstruction in 6G and beyond. It presents a comprehensive, tutorial-style synthesis of score-based diffusion theory, conditioning mechanisms (inference-time, training-time, and classifier-free guidance), and efficient and generalized diffusion techniques, all cast within an inverse-problem framework for semantic decoding. The authors articulate practical implications across human-, machine-, and agent-centric scenarios, including generative compression, task-specific perception, and multi-agent coordination, while outlining open issues and future directions. By tying diffusion priors to semantic decoding under uncertainty, the work highlights the potential for ultra-compressed, robust, and semantically faithful wireless transmission in future networks, along with pathways to theoretical and practical standardization.

Abstract

Semantic communications mark a paradigm shift from bit-accurate transmission toward meaning-centric communication, essential as wireless systems approach theoretical capacity limits. The emergence of generative AI has catalyzed generative semantic communications, where receivers reconstruct content from minimal semantic cues by leveraging learned priors. Among generative approaches, diffusion models stand out for their superior generation quality, stable training dynamics, and rigorous theoretical foundations. However, the field currently lacks systematic guidance connecting diffusion techniques to communication system design, forcing researchers to navigate disparate literatures. This article provides the first comprehensive tutorial on diffusion models for generative semantic communications. We present score-based diffusion foundations and systematically review three technical pillars: conditional diffusion for controllable generation, efficient diffusion for accelerated inference, and generalized diffusion for cross-domain adaptation. In addition, we introduce an inverse problem perspective that reformulates semantic decoding as posterior inference, bridging semantic communications with computational imaging. Through analysis of human-centric, machine-centric, and agent-centric scenarios, we illustrate how diffusion models enable extreme compression while maintaining semantic fidelity and robustness. By bridging generative AI innovations with communication system design, this article aims to establish diffusion models as foundational components of next-generation wireless networks and beyond.
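
As a hedged illustration of the inverse-problem view described in the abstract, the decomposition below shows how posterior inference splits into a prior term and a likelihood term. The symbols are assumptions introduced here, not the paper's notation: $x$ is the source content, $y$ is the received semantic cue after the channel, $p(x)$ is the learned prior, and $p(y \mid x)$ is the channel or measurement likelihood.

$$
p(x \mid y) \propto p(y \mid x)\, p(x)
\quad\Longrightarrow\quad
\nabla_x \log p(x \mid y) = \nabla_x \log p(x) + \nabla_x \log p(y \mid x)
$$

The normalizing constant $p(y)$ does not depend on $x$, so it vanishes under the gradient. In this reading, a diffusion model would supply the prior score $\nabla_x \log p(x)$, while the likelihood score encodes what the receiver actually observed. At intermediate noise levels the likelihood term is generally not available in closed form and has to be approximated.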

Paper Structure

This paper contains 101 sections, 52 equations, 14 figures, 5 tables.

Figures (14)

  • Figure 1: Schematic diagram of Weaver's three-level communication model. Level I focuses on accurate bit transmission, Level II on conveyance of meaning, and Level III on achieving desired outcomes.
  • Figure 2: Publication statistics showing the rapid development of diffusion models in recent years. The number of published papers was obtained by searching "diffusion models" in Web of Science, accessed July 31, 2025. For 2025, the blue bar indicates the number of papers collected up to and including July 2025, and the dashed gray bar indicates the projected number for the whole year.
  • Figure 3: Overview of the article organization. Each colored box represents a major section of the article.
  • Figure 4: Comparison between discriminative and generative modeling in machine learning. Discriminative models directly learn the mapping from inputs to outputs, while generative models learn the underlying data distribution enabling synthesis.
  • Figure 5: Score-based modeling pipeline for diffusion models. (a) Score matching: The model learns to approximate the score (gradient of log-density, indicated by arrows) of the data distribution through techniques such as denoising score matching. (b) Stochastic sampling: Langevin dynamics generates samples by following the learned score function with stochastic perturbations, where annealing refers to gradually decreasing noise levels during the sampling process.
  • ...and 9 more figures
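
The stochastic sampling step described in the Figure 5 caption (Langevin dynamics with annealed noise levels) can be sketched in a few lines. This is a minimal toy sketch under stated assumptions, not the authors' implementation: the data distribution is a hypothetical 1D Gaussian N(mu, s^2), so its noise-perturbed score is known in closed form and stands in for a learned score network. The function names, the geometric noise schedule, and all constants are illustrative choices. The step size is scaled in proportion to the squared noise level, which is a commonly used rule for annealed Langevin sampling.

```python
import math
import random
import statistics


def gaussian_score(x, sigma, mu=2.0, s=0.5):
    """Exact score (gradient of log-density) of N(mu, s^2 + sigma^2).

    Assumption: this closed form replaces a learned score network
    s_theta(x, sigma); mu and s define a hypothetical toy data distribution.
    """
    return -(x - mu) / (s * s + sigma * sigma)


def annealed_langevin(n_samples=1000, n_levels=10, steps=50,
                      sigma_max=5.0, sigma_min=0.1, eps=0.01, seed=0):
    """Draw samples by running Langevin dynamics over decreasing noise levels."""
    rng = random.Random(seed)
    # Geometric noise schedule from sigma_max down to sigma_min.
    ratio = sigma_min / sigma_max
    sigmas = [sigma_max * ratio ** (i / (n_levels - 1)) for i in range(n_levels)]

    samples = []
    for _ in range(n_samples):
        x = rng.gauss(0.0, sigma_max)  # broad, uninformative initialization
        for sigma in sigmas:
            # Step size proportional to sigma^2, so steps shrink as noise anneals.
            alpha = eps * (sigma / sigma_min) ** 2
            for _step in range(steps):
                z = rng.gauss(0.0, 1.0)
                # Langevin update: drift along the score plus injected noise.
                x = x + 0.5 * alpha * gaussian_score(x, sigma) + math.sqrt(alpha) * z
        samples.append(x)
    return samples


if __name__ == "__main__":
    xs = annealed_langevin()
    print("sample mean:", statistics.mean(xs))
    print("sample std: ", statistics.pstdev(xs))
```

With these settings the samples should concentrate near mu = 2.0, with a spread close to sqrt(s^2 + sigma_min^2), which is roughly 0.51, since the final noise level is small but nonzero. In a real system the closed-form score would be replaced by a network trained with denoising score matching, as in panel (a) of the figure.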