
Task-Driven Semantic Quantization and Imitation Learning for Goal-Oriented Communications

Yu-Chieh Chao, Yubei Chen, Weiwei Wang, Achintha Wijesinghe, Suchinthaka Wanninayaka, Songyang Zhang, Zhi Ding

TL;DR

The paper tackles bandwidth-efficient, goal-oriented communication (GO-COM) by judging a transmission on how well it serves the receiver's downstream task rather than on pixel-wise fidelity. It introduces GOS-VAE, which pairs a lightweight encoder with VQ-VAE codebooks and uses imitation learning to preserve downstream-task semantics (here, semantic segmentation) through end-to-end training with a powerful back-end decoder and task model. Key contributions include integrating LPIPS as perceptual regularization, employing Jensen-Shannon-based imitation targets, and demonstrating superior performance on Cityscapes and ADE20K with a substantially smaller payload than baselines such as the diffusion-based GESCO. The resulting reconstructions are robust and task-focused, which suits edge-to-cloud deployment for AI-driven applications such as autonomous driving. Overall, GOS-VAE advances task-driven semantic compression by aligning reconstruction with downstream semantic requirements while maintaining bandwidth efficiency.

Abstract

Semantic communication marks a paradigm shift from bit-wise data transmission to semantic information delivery for the purpose of bandwidth reduction. To carry out specialized downstream tasks more effectively at the receiver end, it is crucial to define the most critical semantic message in the data based on task- or goal-oriented features. In this work, we propose a novel goal-oriented communication (GO-COM) framework, namely the Goal-Oriented Semantic Variational Autoencoder (GOS-VAE), which focuses on extracting the semantics vital to the downstream tasks. Specifically, we adopt a Vector Quantized Variational Autoencoder (VQ-VAE) to compress media data at the transmitter side. Instead of targeting pixel-wise image reconstruction, we measure the quality of service at the receiver end based on a pre-defined task-incentivized model. Moreover, to capture the relevant semantic features in the data reconstruction, imitation learning is adopted to measure the data regeneration quality in terms of goal-oriented semantics. Our experimental results demonstrate the power of imitation learning in characterizing goal-oriented semantics and the bandwidth efficiency of the proposed GOS-VAE.
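
As a rough illustration of the vector-quantization step that a VQ-VAE transmitter relies on, the sketch below maps each encoder output to its nearest codebook entry, so that only integer indices need to be sent. This is a generic NumPy sketch, not the authors' implementation: the function names `quantize` and `payload_bits`, the toy codebook, and the fixed-length index coding are all assumptions made for illustration.

```python
import numpy as np

def quantize(latents, codebook):
    """Map each latent vector to its nearest codebook entry.

    latents:  (N, D) array of encoder outputs (hypothetical toy values below)
    codebook: (K, D) array of code vectors
    Returns (indices, quantized): (N,) integer indices and (N, D) vectors.
    """
    # Squared Euclidean distance between every latent and every code vector.
    d2 = ((latents[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    indices = d2.argmin(axis=1)
    return indices, codebook[indices]

def payload_bits(num_indices, codebook_size):
    """Bits needed to send the indices with a fixed-length code (an assumption)."""
    return num_indices * int(np.ceil(np.log2(codebook_size)))

# Toy example: 4 code vectors in 2 dimensions, 3 latent vectors to transmit.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0], [3.0, 0.0]])
latents = np.array([[0.1, -0.2], [0.9, 1.2], [2.6, 0.3]])

indices, quantized = quantize(latents, codebook)
bits = payload_bits(len(indices), len(codebook))
print(indices, bits)
```

The receiver, holding the same codebook, recovers `quantized` from `indices` alone, which is where the bandwidth saving comes from: the payload grows with the number of indices and the logarithm of the codebook size, not with the latent dimension.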

Paper Structure

This paper contains 20 sections, 8 equations, 4 figures, 4 tables.

Figures (4)

  • Figure 1: The proposed Goal-Oriented Semantic Variational Autoencoder (GOS-VAE) framework for semantic communication. The snowflake symbol denotes pre-trained and fixed model parameters.
  • Figure 2: Visualization Results of Different Methods on Cityscapes Dataset.
  • Figure 3: Training curves of semantic segmentation loss (JSD) and LPIPS for GOS-VAE on the ADE20K dataset. The correlation of the two curves is 0.976.
  • Figure 4: Performance comparisons of models on the Cityscapes dataset under different compression ratios (r).
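
The Figure 3 caption refers to a semantic segmentation loss measured with the Jensen-Shannon divergence (JSD). For readers unfamiliar with it, here is a minimal sketch of the standard JSD between two per-pixel class distributions, for instance the task model's outputs on an original image and on its reconstruction. The paper's exact loss formulation is not given in this section, so the function `jsd`, the clipping constant `eps`, and the toy inputs are assumptions.

```python
import numpy as np

def jsd(p, q, eps=1e-12):
    """Jensen-Shannon divergence (natural log) along the last axis.

    p, q: arrays of class probabilities whose last axis sums to 1,
          e.g. shape (H, W, C) for per-pixel segmentation outputs.
    """
    # Clip to avoid log(0); eps is an illustrative choice, not from the paper.
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    m = 0.5 * (p + q)  # mixture distribution
    kl_pm = np.sum(p * np.log(p / m), axis=-1)
    kl_qm = np.sum(q * np.log(q / m), axis=-1)
    return 0.5 * (kl_pm + kl_qm)

# Toy per-pixel predictions over 3 classes for a 1x2 "image".
pred_original = np.array([[[0.7, 0.2, 0.1], [1.0, 0.0, 0.0]]])
pred_reconstructed = np.array([[[0.6, 0.3, 0.1], [0.0, 1.0, 0.0]]])

per_pixel = jsd(pred_original, pred_reconstructed)
loss = per_pixel.mean()  # averaging over pixels is an assumed reduction
```

JSD is symmetric and bounded above by ln 2 (with the natural logarithm), which makes it a convenient measure of how closely the segmentation of a reconstructed image imitates that of the original.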