Knowledge-Aware Semantic Communication System Design

Sachin Kadam, Dong In Kim

TL;DR

This work tackles the data-rate bottleneck in semantic communication (SemCom) for 6G by leveraging a shared knowledge base (KB) to extract and transmit only keywords from text. It designs an auto-encoder/decoder framework that encodes keywords and uses the KB at both ends to reconstruct sentences, while optimizing a semantic distortion objective $\mathcal{L}$ with an upper bound $B$ and employing mutual-information estimation via MINE. Experiments on a football/soccer commentary dataset show that the proposed method reduces the average number of transmitted words compared to a state-of-the-art baseline (DeepSC) while maintaining meaningful reconstruction quality as measured by the BLEU score, with a tunable trade-off controlled by the KB size parameter $\rho$ and the required accuracy $\tau$. The approach offers a practical path to higher-throughput SemCom without hardware changes and motivates extending the KB-based semantic compression paradigm to other data modalities such as image, audio, and video.
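
For context on the mutual-information term, MINE (mutual information neural estimation) is usually stated through the Donsker-Varadhan lower bound. The form below is the standard one from the MINE literature, not necessarily the exact expression used in this paper; the symbols $X$ (transmitted representation), $Z$ (received representation), and $T_\theta$ (a neural network with parameters $\theta$) are assumptions introduced here for illustration.

$$
I(X;Z) \;\ge\; \sup_{\theta}\; \mathbb{E}_{P_{XZ}}\!\left[T_\theta(x,z)\right] \;-\; \log \mathbb{E}_{P_X \otimes P_Z}\!\left[e^{T_\theta(x,z)}\right]
$$

Training $T_\theta$ to maximize the right-hand side gives an estimate of the mutual information that can be used as a differentiable training signal.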

Abstract

The recent emergence of 6G raises the challenge of increasing the transmission data rate even further in order to break the barrier set by the Shannon limit. Traditional communication methods fall short of the 6G goals, paving the way for Semantic Communication (SemCom) systems. These systems find applications in a wide range of fields, such as economics, the metaverse, autonomous transportation systems, healthcare, and smart factories. In SemCom systems, only the relevant information in the data, known as semantic data, is extracted, which eliminates unwanted overhead in the raw data; the extracted information is then encoded and transmitted. In this paper, we first use the shared knowledge base to extract the keywords from the dataset. Then, we design an auto-encoder that transmits only these keywords and an auto-decoder that recovers the data using the received keywords and the shared knowledge. We show analytically that the overall semantic distortion function has an upper bound, which is shown in the literature to converge. We numerically compute the accuracy of the reconstructed sentences at the receiver. Using simulations, we show that the proposed methods outperform a state-of-the-art method in terms of the average number of words per sentence.
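
The keyword-only transmission idea can be illustrated with a toy sketch. It is not the paper's method: the knowledge base below is a hypothetical hand-written set of words, the example sentences are made up, and the learned auto-encoder/decoder is left out entirely. It only shows how filtering sentences through a shared KB reduces the average number of words per sentence, the metric reported in the abstract.

```python
from typing import List, Set

# Hypothetical shared knowledge base: a set of domain keywords assumed to be
# known to both transmitter and receiver (illustrative only).
SHARED_KB: Set[str] = {"messi", "goal", "penalty", "barcelona", "scores", "foul"}


def extract_keywords(sentence: str, kb: Set[str]) -> List[str]:
    """Keep only the words of `sentence` that appear in the shared KB."""
    words = sentence.lower().replace(".", "").replace(",", "").split()
    return [w for w in words if w in kb]


def average_words(sentences: List[List[str]]) -> float:
    """Average number of words per sentence."""
    return sum(len(s) for s in sentences) / len(sentences)


# Made-up commentary sentences, not taken from the paper's dataset.
corpus = [
    "Messi scores a goal for Barcelona.",
    "The referee gives a penalty after the foul.",
]

raw = [s.split() for s in corpus]
keywords = [extract_keywords(s, SHARED_KB) for s in corpus]

print(keywords)
print("average words, raw:     ", average_words(raw))
print("average words, keywords:", average_words(keywords))
```

In the system described in the abstract, the receiver would rebuild the full sentence from the received keywords and the same KB; that reconstruction step is learned and is not attempted in this sketch.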

Paper Structure

This paper contains 8 sections, 1 theorem, 17 equations, 3 figures, and 1 table.

Key Result

Theorem 1

The overall semantic distortion function attains the following upper-bound:

Figures (3)

  • Figure 1: The block diagram of the proposed SemCom system model. The model in Fig. 1(a) is used for training the system parameters, and the model in Fig. 1(b) is used for evaluating the system.
  • Figure 2: BLEU score vs. $\rho$ for $n$-grams with $n \in \{1,2,3,4\}$, for the proposed schemes and the DeepSC scheme [xie2021deep].
  • Figure 3: Average number of words per sentence vs. $\rho$ (left plot) and vs. $\tau$ (right plot), for the proposed schemes and the DeepSC scheme [xie2021deep].
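
Figure 2 reports BLEU for $n$-grams of order 1 to 4. As a rough illustration of what that measures, the sketch below computes the clipped (modified) $n$-gram precision at the core of BLEU for a single sentence pair. It is a simplified stand-in, not the paper's evaluation code: it omits BLEU's brevity penalty and the geometric averaging across orders, and the example sentences are made up.

```python
from collections import Counter
from typing import List


def ngrams(tokens: List[str], n: int) -> Counter:
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(reference: List[str], candidate: List[str], n: int) -> float:
    """Clipped n-gram precision of `candidate` against a single `reference`."""
    cand = ngrams(candidate, n)
    ref = ngrams(reference, n)
    total = sum(cand.values())
    if total == 0:
        return 0.0  # candidate is shorter than n tokens
    # Each candidate n-gram is credited at most as often as it occurs in the reference.
    clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
    return clipped / total


# Made-up transmitted sentence and reconstruction, for illustration only.
reference = "messi scores a goal for barcelona".split()
candidate = "messi scores the goal for barcelona".split()

for n in (1, 2, 3, 4):
    print(n, modified_precision(reference, candidate, n))
```

A single wrong word lowers the precision more sharply at higher orders, because it breaks every $n$-gram that spans it. That is one reason the curves for different $n$ in a plot like Figure 2 are typically separated.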

Theorems & Definitions (3)

  • Theorem 1
  • Proof
  • Remark 1