Distributionally Robust Wireless Semantic Communication with Large AI Models

Long Tan Le, Senura Hansaja Wanasekara, Zerun Niu, Nguyen H. Tran, Phuong Vo, Walid Saad, Dusit Niyato, Zhu Han, Choong Seon Hong, H. Vincent Poor

TL;DR

This work addresses robustness in semantic communication for wireless networks by introducing WaSeCom, a bilevel framework based on Wasserstein distributionally robust optimization (WDRO) that hedges against both semantic interpretation errors and channel distortions. It leverages dual WDRO formulations and log-sum-exp smoothing to enable scalable training of large AI backbones (e.g., ViT, BERT) for text and image modalities, with two independent Wasserstein ambiguity sets: one for semantic inputs and one for channel outputs. Theoretical results establish generalization guarantees and convergence to robust stationary points, while empirical results on CIFAR-10 and Europarl demonstrate improved robustness under semantic perturbations and wireless channel variability without sacrificing nominal performance. The approach offers a model-agnostic, end-to-end solution that enhances semantic fidelity under adverse wireless conditions, with practical guidelines for hyperparameter tuning and potential extensions to time-varying channels.
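The log-sum-exp smoothing mentioned above can be illustrated with a minimal sketch. It shows only the generic identity that replaces a non-smooth maximum over candidate losses with a differentiable surrogate; the function names `smooth_max` and `smooth_max_weights` and the temperature `tau` are assumptions made for illustration, and the paper's exact smoothed objective is not reproduced in this summary.

```python
import numpy as np

def smooth_max(values, tau=0.1):
    """Log-sum-exp surrogate for max(values); tau > 0 is a hypothetical temperature."""
    v = np.asarray(values, dtype=float)
    m = v.max()  # subtract the max before exponentiating for numerical stability
    return m + tau * np.log(np.exp((v - m) / tau).sum())

def smooth_max_weights(values, tau=0.1):
    """Gradient of smooth_max with respect to values: a softmax over the candidates."""
    v = np.asarray(values, dtype=float)
    w = np.exp((v - v.max()) / tau)
    return w / w.sum()

# Illustrative losses at three candidate perturbations (made-up numbers).
losses = [0.2, 1.5, 0.9]
approx = smooth_max(losses, tau=0.05)
weights = smooth_max_weights(losses, tau=0.05)
```

The surrogate always lies between `max(values)` and `max(values) + tau * log(n)` for `n` candidates, so a smaller `tau` tracks the worst case more tightly at the cost of sharper gradients.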

Abstract

Semantic communication (SemCom) has emerged as a promising paradigm for 6G wireless systems by transmitting task-relevant information rather than raw bits, yet existing approaches remain vulnerable to two sources of uncertainty: semantic misinterpretation arising from imperfect feature extraction and transmission-level perturbations from channel noise. Current deep-learning-based SemCom systems typically employ domain-specific architectures that lack robustness guarantees and fail to generalize across diverse noise conditions, adversarial attacks, and out-of-distribution data. In this paper, a novel and generalized semantic communication framework called WaSeCom is proposed to systematically address uncertainty and enhance robustness. In particular, Wasserstein distributionally robust optimization is employed to provide resilience against semantic misinterpretation and channel perturbations. A rigorous theoretical analysis is performed to establish the robust generalization guarantees of the proposed framework. Experimental results on image and text transmission demonstrate that WaSeCom achieves improved robustness under noise and adversarial perturbations. These results highlight its effectiveness in preserving semantic fidelity across varying wireless conditions.

Paper Structure

This paper contains 36 sections, 3 theorems, 46 equations, 10 figures, 4 tables, 1 algorithm.

Key Result

Lemma 1

Suppose $f \in \mathcal{F}_{\text{s}}$ is $L_{\text{s}}$-Lipschitz and $g \in \mathcal{F}_{\text{c}}$ is $L_{\text{c}}$-Lipschitz. If $\lambda \ge L_{\text{s}}/\rho$ and $\gamma \ge L_{\text{c}}/\mu$, then for any $P' \in \mathcal{B}(P, \rho)$ and $Z' \in \mathcal{B}(Z, \mu)$: where $\lambda^*$ and $\gamma^*$ are the optimal dual variables corresponding to the inner and outer WDRO problems, respe

Figures (10)

  • Figure 1: Large AI-enabled wireless semantic communication. Source data is encoded into compact, task-relevant representations and transmitted over wireless channels; the receiver reconstructs the intended meaning, even under semantic and channel noise.
  • Figure 2: WDRO aims to find model parameters $\theta$ that minimize the worst-case expected objective $f_\theta$ assuming the true data distribution $Q$ is within a small Wasserstein ball $\mathcal{B}_p$ of the empirical distribution $P$.
  • Figure 3: Overview of the $\small{\textsf{WaSeCom}}$ framework. The proposed bilevel WDRO model jointly optimizes semantic and channel encoder-decoder pairs for robustness. The inner level (orange) handles semantic input shifts, while the outer level (blue) addresses channel noise.
  • Figure 4: Performance of image transmission tasks at different semantic noise ratios over an AWGN channel.
  • Figure 5: Performance of image transmission tasks at different semantic noise ratios over a Rayleigh channel.
  • ...and 5 more figures
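
The WDRO objective described in the Figure 2 caption can be written out as a worked formula. The block below is a sketch in generic notation, assuming a ground metric $d$, a Wasserstein order $p$, and a radius $\rho$; the paper's exact cost function and constraints are not shown in this summary. The second line is the standard strong-duality reformulation for Wasserstein balls, which holds under mild regularity conditions and is the kind of dual form that makes the inner supremum tractable.

```latex
% Primal: minimize the worst-case expected objective over the Wasserstein ball
\[
  \min_{\theta} \; \sup_{Q \in \mathcal{B}_p(P,\rho)} \mathbb{E}_{x \sim Q}\bigl[ f_\theta(x) \bigr],
  \qquad
  \mathcal{B}_p(P,\rho) = \bigl\{ Q : W_p(Q, P) \le \rho \bigr\}.
\]
% Dual: a scalar multiplier \lambda replaces the supremum over distributions
\[
  \sup_{Q \in \mathcal{B}_p(P,\rho)} \mathbb{E}_{x \sim Q}\bigl[ f_\theta(x) \bigr]
  = \inf_{\lambda \ge 0} \Bigl\{ \lambda \rho^{p}
    + \mathbb{E}_{x \sim P}\Bigl[ \sup_{x'} \bigl( f_\theta(x') - \lambda\, d(x, x')^{p} \bigr) \Bigr] \Bigr\}.
\]
```

Here $\lambda$ plays the role of the dual variable that appears in Lemma 1; when $P$ is an empirical distribution, the outer expectation is a finite average over training samples.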

Theorems & Definitions (11)

  • Definition 1
  • Remark: On Assumptions
  • Definition 2: Expected Risks and Surrogate Risks
  • Definition 3: Excess Risks and Robust Excess Risks
  • Lemma 1: Bi-level Surrogate Excess Risk Bounds
  • Remark
  • Theorem 1: Robust generalization bounds
  • Remark
  • Proposition 1
  • proof
  • ...and 1 more