Task-Driven Semantic Quantization and Imitation Learning for Goal-Oriented Communications
Yu-Chieh Chao, Yubei Chen, Weiwei Wang, Achintha Wijesinghe, Suchinthaka Wanninayaka, Songyang Zhang, Zhi Ding
TL;DR
The paper tackles bandwidth-efficient, goal-oriented communication (GO-COM) by judging transmission quality by downstream-task performance rather than pixel-wise fidelity. It introduces GOS-VAE, which pairs a lightweight encoder and VQ-VAE codebooks with imitation learning to preserve the semantics needed by the downstream task (here, semantic segmentation), and trains the system end to end with a powerful back-end decoder and task model. Key contributions include LPIPS as a perceptual regularizer, Jensen-Shannon-based imitation targets, and results on Cityscapes and ADE20K that outperform baselines such as the diffusion-based GESCO at a substantially smaller payload. The resulting task-focused reconstructions make the approach a practical fit for edge-to-cloud deployment in AI-driven applications such as autonomous driving.
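To make the "Jensen-Shannon-based imitation targets" concrete, here is a minimal sketch of the standard Jensen-Shannon divergence between two sets of per-pixel class distributions, such as a task model's predictions on the original and on the reconstructed image. The function names, tensor shapes, and the choice to average over pixels are illustrative assumptions; the summary does not specify how the authors apply or weight this term.

```python
import numpy as np

def softmax(logits, axis=-1):
    """Numerically stable softmax over the class axis."""
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence (in nats) along the last axis.

    p, q: arrays of probabilities with matching shapes, e.g. (pixels, classes).
    Returns one value per pixel; bounded between 0 and ln(2).
    """
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    m = 0.5 * (p + q)  # mixture distribution
    kl_pm = (p * np.log(p / m)).sum(axis=-1)
    kl_qm = (q * np.log(q / m)).sum(axis=-1)
    return 0.5 * (kl_pm + kl_qm)

# Hypothetical example: 4 pixels, 3 classes.
rng = np.random.default_rng(0)
logits_original = rng.normal(size=(4, 3))       # task model on the source image (assumed)
logits_reconstructed = rng.normal(size=(4, 3))  # task model on the reconstruction (assumed)

per_pixel = js_divergence(softmax(logits_original), softmax(logits_reconstructed))
imitation_loss = per_pixel.mean()  # averaging over pixels is an assumption
```

Unlike the KL divergence, this quantity is symmetric and bounded, which is one common reason to prefer it when matching two predicted distributions.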
Abstract
Semantic communication marks a paradigm shift from bit-wise data transmission to semantic information delivery for the purpose of bandwidth reduction. To carry out specialized downstream tasks more effectively at the receiver end, it is crucial to define the most critical semantic message in the data based on task- or goal-oriented features. In this work, we propose a novel goal-oriented communication (GO-COM) framework, named the Goal-Oriented Semantic Variational Autoencoder (GOS-VAE), that focuses on extracting the semantics vital to the downstream tasks. Specifically, we adopt a Vector Quantized Variational Autoencoder (VQ-VAE) to compress media data at the transmitter side. Instead of targeting pixel-wise image reconstruction, we measure the quality of service at the receiver end based on a pre-defined task-incentivized model. Moreover, to capture the relevant semantic features in the data reconstruction, imitation learning is adopted to measure the data regeneration quality in terms of goal-oriented semantics. Our experimental results demonstrate the power of imitation learning in characterizing goal-oriented semantics and the bandwidth efficiency of the proposed GOS-VAE.
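The bandwidth saving from the VQ-VAE step can be illustrated with a minimal sketch of generic vector quantization: the transmitter maps each latent vector to the index of its nearest codebook entry and sends only the indices, and the receiver looks the vectors up again. The codebook size, latent grid, and function names below are hypothetical values chosen for illustration, not the configuration used by GOS-VAE.

```python
import numpy as np

def quantize(latents, codebook):
    """Return, for each latent vector, the index of its nearest codebook entry.

    latents: (N, D) array, codebook: (K, D) array.
    """
    # Squared Euclidean distance between every latent and every code: (N, K)
    d2 = ((latents[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)

def dequantize(indices, codebook):
    """Receiver side: recover the quantized latents from transmitted indices."""
    return codebook[indices]

def payload_bits(num_indices, codebook_size):
    """Bits needed to send the indices with a fixed-length code."""
    bits_per_index = (codebook_size - 1).bit_length()  # ceil(log2(K)) for K >= 1
    return num_indices * bits_per_index

rng = np.random.default_rng(0)
codebook = rng.normal(size=(256, 16))    # hypothetical: K=256 codes of dimension 16
latents = rng.normal(size=(8 * 16, 16))  # hypothetical 8x16 latent grid from the encoder

indices = quantize(latents, codebook)      # what the transmitter sends
recovered = dequantize(indices, codebook)  # what the receiver's decoder consumes
print(payload_bits(indices.size, len(codebook)))  # 128 indices * 8 bits = 1024
```

In this toy setting the payload depends only on the number of latent positions and the codebook size, not on the image resolution or pixel depth, which is the source of the bandwidth reduction described above.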
