Knowledge-Aware Semantic Communication System Design and Data Allocation

Sachin Kadam, Dong In Kim

TL;DR

This paper designs a semantic communication system that transmits only keywords extracted from text via a shared knowledge base, reducing transmission overhead while preserving a tunable level of semantic accuracy. It introduces an auto-encoder/decoder architecture and a combined BLEU–BERT based Semantic Score to quantify semantic loss, enabling a controllable trade-off between overhead and accuracy. The authors formulate the Data Allocation Problem (DAP) to optimally distribute dataset categories to users under storage and budget constraints, prove its NP-completeness, and propose a greedy algorithm that achieves near-optimal profits in simulations. Through extensive experiments on a soccer-commentary dataset, the proposed SemCom design outperforms state-of-the-art schemes in terms of average words per sentence for a given accuracy, and the greedy DAP solution closely matches the optimal solution, validating the practicality of the approach for 6G-type systems and data-center services.

Abstract

The recent emergence of 6G raises the challenge of increasing the transmission data rate even further in order to overcome the Shannon limit. Traditional communication methods fall short of the 6G goals, paving the way for Semantic Communication (SemCom) systems, which have applications in the metaverse, healthcare, economics, etc. In SemCom systems, only the relevant keywords are extracted from the data and transmitted. In this paper, we design an auto-encoder that transmits only these keywords and an auto-decoder that recovers the data from the received keywords and the shared knowledge. This SemCom system is used in a setup in which the receiver allocates various categories of the same dataset collected from the transmitter, which differ in size and accuracy, to a number of users. This scenario is formulated as an optimization problem called the data allocation problem (DAP). We show that it is NP-complete and propose a greedy algorithm to solve it. Using simulations, we show that the proposed methods for SemCom system design outperform state-of-the-art methods in terms of average number of words per sentence for a given accuracy, and that the proposed greedy algorithm's solution of the DAP is very close to the optimal solution.

Paper Structure (20 sections, 1 theorem, 28 equations, 10 figures, 4 tables, 1 algorithm)

Key Result

Theorem 1

The DAP is NP-complete.
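
Because the DAP is NP-complete, no polynomial-time exact algorithm is known for it, which is what motivates a greedy heuristic. The sketch below illustrates that general idea on a knapsack-style toy instance. It is not the paper's DAP formulation or its greedy algorithm: the category sizes and prices, the per-user storage and budget values, the shared `TOTAL_CAPACITY` limit, and the price-per-size ranking rule are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical toy instance; NOT the paper's DAP formulation.
# Category name -> (size, price). Larger categories are assumed pricier.
CATEGORIES = {"small": (2, 3), "medium": (4, 5), "large": (7, 8)}
# User name -> (storage, budget).
USERS = {"u1": (4, 5), "u2": (7, 8), "u3": (2, 3)}
# Assumed shared limit on the total size that can be served.
TOTAL_CAPACITY = 6


def feasible(user, cat):
    """A category fits a user if it respects that user's storage and budget."""
    size, price = CATEGORIES[cat]
    storage, budget = USERS[user]
    return size <= storage and price <= budget


def profit(alloc):
    """Total price collected over all allocated (user, category) pairs."""
    return sum(CATEGORIES[c][1] for c in alloc.values())


def total_size(alloc):
    return sum(CATEGORIES[c][0] for c in alloc.values())


def greedy_allocation():
    """Serve feasible pairs in decreasing price-per-size order (a heuristic)."""
    pairs = [(u, c) for u in USERS for c in CATEGORIES if feasible(u, c)]
    pairs.sort(key=lambda uc: CATEGORIES[uc[1]][1] / CATEGORIES[uc[1]][0],
               reverse=True)
    alloc, used = {}, 0
    for u, c in pairs:
        size = CATEGORIES[c][0]
        # Each user receives at most one category.
        if u not in alloc and used + size <= TOTAL_CAPACITY:
            alloc[u] = c
            used += size
    return alloc


def optimal_allocation():
    """Exhaustive search; the number of combinations grows exponentially."""
    users = list(USERS)
    options = [[None] + [c for c in CATEGORIES if feasible(u, c)]
               for u in users]
    best, best_profit = {}, 0
    for choice in product(*options):
        alloc = {u: c for u, c in zip(users, choice) if c is not None}
        if total_size(alloc) <= TOTAL_CAPACITY and profit(alloc) > best_profit:
            best, best_profit = alloc, profit(alloc)
    return best
```

Comparing `profit(greedy_allocation())` with `profit(optimal_allocation())` on small instances mirrors, in spirit, the greedy-versus-optimal comparison the paper reports. A ranking heuristic like this one carries no optimality guarantee in general, and the exhaustive search is only practical for a handful of users.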

Figures (10)

  • Figure 1: The block diagram of our proposed SemCom system model. The model in Fig. 1(a) is used for training the system parameters and the model in Fig. 1(b) is used for evaluating the SemCom system.
  • Figure 2: The architecture of the semantic encoder/decoder and channel encoder/decoder models of the proposed SemCom system model.
  • Figure 3: This figure shows the setup used to describe the data allocation problem (DAP).
  • Figure 4: This plot shows the BLEU score vs. $\rho$ for $n$-grams with $n \in \{1,2,3,4\}$, for the proposed schemes and the DeepSC scheme [xie2021deep].
  • Figure 5: These plots show the average number of words per sentence vs. $\rho$ (left) and vs. $\tau$ (right) for the proposed schemes and the DeepSC scheme [xie2021deep].
  • ...and 5 more figures
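
For readers unfamiliar with the metric in Figure 4, the following is a minimal sketch of a sentence-level BLEU score for $n$-grams up to a chosen order. It uses the standard definition (clipped $n$-gram precision, geometric mean, brevity penalty) with a single reference and no smoothing. The paper's exact BLEU configuration and how it combines BLEU with BERT are not given here, so this should not be read as the authors' implementation. The example sentences in the usage note are made up.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision of the candidate against one reference."""
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    # Each candidate n-gram is credited at most as often as it occurs
    # in the reference.
    clipped = sum(min(count, ref_counts[gram])
                  for gram, count in cand_counts.items())
    return clipped / total


def bleu(candidate, reference, max_n):
    """Unsmoothed BLEU with uniform weights over orders 1..max_n."""
    precisions = [modified_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # without smoothing, any zero precision gives zero
    log_mean = sum(math.log(p) for p in precisions) / max_n
    c, r = len(candidate), len(reference)
    # Brevity penalty: penalize candidates shorter than the reference.
    bp = 1.0 if c > r else math.exp(1 - r / c)
    return bp * math.exp(log_mean)
```

As a usage example, with `reference = "the striker scores a late goal".split()` and `candidate = "the striker scores a goal".split()`, calling `bleu(candidate, reference, n)` for `n` from 1 to 4 shows how the score changes as longer $n$-grams are required to match.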

Theorems & Definitions (4)

  • Remark 1
  • Theorem 1
  • Proof
  • Remark 2