Large Generative Model Assisted 3D Semantic Communication

Feibo Jiang, Yubo Peng, Li Dong, Kezhi Wang, Kun Yang, Cunhua Pan, Xiaohu You

TL;DR

The paper tackles 3D semantic communication in 6G by addressing key challenges in 3D semantic extraction, latent redundancy, and channel estimation. It introduces GAM-3DSC, a framework that integrates a 3D Semantic Extractor (3DSE) based on NeRF and SAM, an Adaptive Semantic Compression Model (ASCM) with a dual-head encoder and self-knowledge distillation, and a Generative AI–assisted channel estimation (GDCE) pipeline using CGANs and diffusion models. Through extensive simulations, the authors demonstrate improved semantic extraction accuracy, substantial data reduction in image transmission, and enhanced channel estimation quality, yielding pixel- and semantic-level fidelity in 3D reconstructions (e.g., PSNR around 25 dB, SSIM ~0.95, BLEU ~0.61, cosine similarity ~0.97). The work provides a practical, end-to-end framework for robust 3D data transmission in 6G, with potential impact on AR/MR and other immersive applications by reducing bandwidth while preserving semantic content and improving CSI robustness.
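For readers unfamiliar with the pixel- and semantic-level metrics quoted above, the following is a minimal sketch of how PSNR and cosine similarity are conventionally computed. It is not the authors' evaluation code: the function names, the 8-bit peak value of 255, and the flat-list input format are assumptions made for illustration.

```python
import math


def psnr(reference, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    `peak` is the maximum possible pixel value (255 assumes 8-bit images).
    """
    if len(reference) != len(reconstructed):
        raise ValueError("inputs must have the same length")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / mse)


def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("inputs must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)
```

PSNR compares images pixel by pixel, so it captures pixel-level fidelity, while cosine similarity is typically applied to feature vectors extracted from the original and reconstructed images, which is why it serves as a semantic-level measure.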

Abstract

Semantic Communication (SC) is a novel paradigm for data transmission in 6G. However, there are several challenges posed when performing SC in 3D scenarios: 1) 3D semantic extraction; 2) Latent semantic redundancy; and 3) Uncertain channel estimation. To address these issues, we propose a Generative AI Model assisted 3D SC (GAM-3DSC) system. Firstly, we introduce a 3D Semantic Extractor (3DSE), which employs generative AI models, including Segment Anything Model (SAM) and Neural Radiance Field (NeRF), to extract key semantics from a 3D scenario based on user requirements. The extracted 3D semantics are represented as multi-perspective images of the goal-oriented 3D object. Then, we present an Adaptive Semantic Compression Model (ASCM) for encoding these multi-perspective images, in which we use a semantic encoder with two output heads to perform semantic encoding and mask redundant semantics in the latent semantic space, respectively. Next, we design a conditional Generative adversarial network and Diffusion model aided-Channel Estimation (GDCE) to estimate and refine the Channel State Information (CSI) of physical channels. Finally, simulation results demonstrate the advantages of the proposed GAM-3DSC system in effectively transmitting the goal-oriented 3D scenario.
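The dual-head idea described in the abstract (one head produces the latent code, the other decides which latent entries are redundant) can be illustrated with a toy sketch. This is an illustration under stated assumptions, not the ASCM architecture: the fixed threshold, the hypothetical function names `mask_latent` and `unmask_latent`, and the zero-filling at the receiver are all choices made here for clarity.

```python
def mask_latent(latent, keep_scores, threshold=0.5):
    """Drop latent entries whose keep score falls below the threshold.

    `latent` stands in for the output of the encoding head and `keep_scores`
    for the output of the masking head (both hypothetical here).
    Returns the binary mask and the reduced payload to transmit.
    """
    if len(latent) != len(keep_scores):
        raise ValueError("latent and keep_scores must have the same length")
    mask = [1 if s >= threshold else 0 for s in keep_scores]
    payload = [z for z, m in zip(latent, mask) if m]
    return mask, payload


def unmask_latent(mask, payload, fill=0.0):
    """Rebuild a full-length latent at the receiver, filling masked slots."""
    if sum(mask) != len(payload):
        raise ValueError("payload length must equal the number of kept entries")
    values = iter(payload)
    return [next(values) if m else fill for m in mask]


# Toy usage: four latent entries, two of which are judged redundant.
mask, payload = mask_latent([0.9, -0.2, 0.4, 1.3], [0.8, 0.1, 0.6, 0.3])
restored = unmask_latent(mask, payload)
```

In this toy setting only the kept entries and the mask need to be sent, so the transmitted payload shrinks in proportion to the number of entries judged redundant.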

Paper Structure

This paper contains 39 sections, 28 equations, 13 figures, 1 table, and 4 algorithms.

Figures (13)

  • Figure 1: The illustration of the 3D SC between a transmitter and a receiver.
  • Figure 2: The illustration of the proposed GAM-3DSC system.
  • Figure 3: The architecture of 3DSE.
  • Figure 4: The architecture of ASCM.
  • Figure 5: SKD-based model training for ASCM.
  • ...and 8 more figures