Joint Semantic-Channel Coding and Modulation for Token Communications

Jingkai Ying, Zhijin Qin, Yulong Feng, Liejun Wang, Xiaoming Tao

TL;DR

This work introduces a token-based wireless transmission framework for point-cloud geometry that jointly optimizes semantic content and channel coding through JSCCM. By employing two parallel Point Transformer encoders and a differentiable modulator, the system produces informative modulated tokens, while a rate allocator and channel adapter enable adaptive transmission across channel conditions. The approach yields robust end-to-end reconstruction with substantial symbol-efficiency gains, outperforming both purely semantic and traditional separated coding baselines, and remains effective under practical scenarios with real-world data and imperfect CSI. The results highlight the practical potential of token-centric semantic communications for 3D data and point toward extensions to dynamic and multimodal token pipelines.

Abstract

In recent years, the Transformer architecture has achieved outstanding performance across a wide range of tasks and modalities. The token, the unified input and output representation of Transformer-based models, has become a fundamental unit of information. In this work, we consider the problem of token communication: how to transmit tokens efficiently and reliably. The point cloud, a prevalent three-dimensional format with a more complex spatial structure than images or video, is chosen as the information source. We use the set abstraction method to obtain point tokens. To obtain a more informative and transmission-friendly token-based representation, we then propose a joint semantic-channel coding and modulation (JSCCM) scheme for the token encoder, which maps point tokens to standard digital constellation points (modulated tokens). Specifically, the JSCCM consists of two parallel Point Transformer-based encoders and a differentiable modulator that combines the Gumbel-softmax and soft quantization methods. In addition, a rate allocator and a channel adapter are developed to enable adaptive generation of high-quality modulated tokens conditioned on both semantic information and channel conditions. Extensive simulations demonstrate that the proposed method outperforms both joint semantic-channel coding and traditional separate coding, achieving over 1 dB gain in reconstruction and a more than 6x compression ratio in modulated symbols.
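
To make the idea of a differentiable modulator more concrete, the following is a minimal NumPy sketch of a generic Gumbel-softmax relaxation over a square QAM constellation. It is not the paper's implementation: the function names (`qam_constellation`, `gumbel_softmax`, `modulate`), the temperature `tau`, the random logits, and the use of a probability-weighted average as the "soft" symbol are all assumptions made for illustration, and the paper's soft quantization step is not reproduced here.

```python
import numpy as np

def qam_constellation(M):
    """Square M-QAM points, normalised to unit average power (illustrative)."""
    side = int(round(np.sqrt(M)))
    assert side * side == M, "M must be a perfect square for square QAM"
    levels = np.arange(-(side - 1), side, 2, dtype=float)  # e.g. -7..7 for M=64
    points = (levels[:, None] + 1j * levels[None, :]).ravel()
    return points / np.sqrt(np.mean(np.abs(points) ** 2))

def gumbel_softmax(logits, tau, rng):
    """Relaxed one-hot sample: softmax((logits + Gumbel noise) / tau)."""
    u = rng.uniform(1e-12, 1.0, size=logits.shape)
    gumbel = -np.log(-np.log(u))
    z = (logits + gumbel) / tau
    z -= z.max(axis=-1, keepdims=True)  # subtract the row max for stability
    p = np.exp(z)
    return p / p.sum(axis=-1, keepdims=True)

def modulate(logits, constellation, tau, rng):
    """Map per-token logits over M points to soft and hard symbols.

    soft: probability-weighted average of constellation points, which is
          smooth in the logits (assumed relaxation used for training).
    hard: the single most likely constellation point (what would be sent).
    """
    probs = gumbel_softmax(logits, tau, rng)
    soft = probs @ constellation
    hard = constellation[np.argmax(probs, axis=-1)]
    return soft, hard, probs

rng = np.random.default_rng(0)
constellation = qam_constellation(64)   # M = 64, as in the Figure 4 example
logits = rng.normal(size=(5, 64))       # hypothetical encoder outputs, 5 tokens
soft, hard, probs = modulate(logits, constellation, tau=0.5, rng=rng)
```

In this sketch, lowering `tau` pushes `probs` toward a one-hot vector, so the soft symbol moves toward a valid constellation point; a higher `tau` gives smoother but less discrete outputs.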

Paper Structure

This paper contains 24 sections, 24 equations, 14 figures, 2 tables.

Figures (14)

  • Figure 1: Schematic of token communications [qiao2025token].
  • Figure 2: The overall framework of the proposed token communication system for point cloud geometry transmission. Blocks with dashed borders indicate that they are optional. The rate allocator and channel adapter are required when relevant adaptive characteristics are needed. If Rayleigh fading channels are considered, the equalizer will be used.
  • Figure 3: The model architecture of JSCC encoders. The numbers in brackets are dimension information. (a) The diagram of two parallel JSCC encoders, including a main JSCC encoder and an auxiliary encoder. (b) The diagram of a Point Transformer layer, which conducts vector attention. (c) The diagram of a Point Transformer Block.
  • Figure 4: An illustrative example of the proposed differentiable modulation method. 64-QAM is used in this example, which means $M=64$.
  • Figure 5: The model architectures of the rate allocator and channel adapter.
  • ...and 9 more figures