Knowledge-Aware Semantic Communication System Design and Data Allocation
Sachin Kadam, Dong In Kim
TL;DR
This paper designs a semantic communication system that transmits only keywords extracted from text via a shared knowledge base, reducing transmission overhead while preserving a tunable level of semantic accuracy. It introduces an auto-encoder/decoder architecture and a combined BLEU–BERT based Semantic Score to quantify semantic loss, enabling a controllable trade-off between overhead and accuracy. The authors formulate the Data Allocation Problem (DAP) to optimally distribute dataset categories to users under storage and budget constraints, prove its NP-completeness, and propose a greedy algorithm that achieves near-optimal profits in simulations. Through extensive experiments on a soccer-commentary dataset, the proposed SemCom design outperforms state-of-the-art schemes in terms of average words per sentence for a given accuracy, and the greedy DAP solution closely matches the optimal solution, validating the practicality of the approach for 6G-type systems and data-center services.
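As a rough illustration of the keyword-only transmission idea, the toy sketch below drops words that a receiver could plausibly restore from shared knowledge and reports the fraction of words actually sent. The `SHARED_KNOWLEDGE` set and the functions `encode` and `overhead_ratio` are hypothetical stand-ins chosen for this example; they are not the paper's auto-encoder, knowledge base, or Semantic Score.

```python
# Toy illustration only: a small set of "predictable" words stands in for the
# shared knowledge base (an assumption; the paper's encoder is not shown here).
SHARED_KNOWLEDGE = {"the", "a", "an", "is", "to", "of", "in", "by", "and"}


def encode(sentence: str) -> list[str]:
    """Keep only the words the receiver could not infer from shared knowledge."""
    return [w for w in sentence.lower().split() if w not in SHARED_KNOWLEDGE]


def overhead_ratio(sentence: str) -> float:
    """Fraction of the original words that are actually transmitted."""
    words = sentence.split()
    return len(encode(sentence)) / len(words) if words else 0.0


sentence = "The ball is passed to the striker in the box"
keywords = encode(sentence)       # words that would be sent over the channel
ratio = overhead_ratio(sentence)  # transmitted words / original words
print(keywords, ratio)
```

In this toy setting, shrinking or growing the shared set changes how many words are sent, which loosely mirrors the overhead versus accuracy trade-off described above.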
Abstract
The recent emergence of 6G raises the challenge of increasing the transmission data rate even further in order to overcome the Shannon limit. Traditional communication methods fall short of the 6G goals, paving the way for Semantic Communication (SemCom) systems, which have applications in the metaverse, healthcare, economics, and other domains. In SemCom systems, only the relevant keywords are extracted from the data and transmitted. In this paper, we design an auto-encoder that transmits only these keywords and an auto-decoder that recovers the data from the received keywords and the shared knowledge. This SemCom system is used in a setup in which the receiver allocates various categories of the same dataset collected from the transmitter, which differ in size and accuracy, to a number of users. This scenario is formulated as an optimization problem called the data allocation problem (DAP). We show that the DAP is NP-complete and propose a greedy algorithm to solve it. Using simulations, we show that the proposed methods for SemCom system design outperform state-of-the-art methods in terms of the average number of words per sentence for a given accuracy, and that the proposed greedy algorithm's solution of the DAP is very close to the optimal solution.
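To make the allocation setting concrete, here is a minimal sketch of a greedy heuristic for a simplified, hypothetical variant of the DAP. It assumes that each category has a size and a price, that each user has a storage limit and a budget, that a category goes to at most one user, and that profit is the sum of the prices paid. None of these modelling choices, names, or numbers come from the paper, and the greedy rule (highest price first) is one plausible choice rather than the authors' algorithm.

```python
from dataclasses import dataclass, field


@dataclass
class Category:
    name: str
    size: float   # storage the category occupies (hypothetical units)
    price: float  # amount a user pays; counted as profit in this sketch


@dataclass
class User:
    name: str
    storage: float  # remaining storage capacity
    budget: float   # remaining budget
    assigned: list = field(default_factory=list)


def greedy_allocate(categories, users):
    """Give each category to the first user who can store and afford it,
    considering categories in order of decreasing price."""
    profit = 0.0
    for cat in sorted(categories, key=lambda c: c.price, reverse=True):
        for user in users:
            if cat.size <= user.storage and cat.price <= user.budget:
                user.storage -= cat.size
                user.budget -= cat.price
                user.assigned.append(cat.name)
                profit += cat.price
                break  # category is allocated; move on to the next one
    return profit


# Made-up example data: three accuracy levels of the same dataset, two users.
categories = [
    Category("high_accuracy", size=8, price=10),
    Category("medium_accuracy", size=5, price=7),
    Category("low_accuracy", size=2, price=3),
]
users = [User("u1", storage=8, budget=10), User("u2", storage=6, budget=8)]
total_profit = greedy_allocate(categories, users)
print(total_profit, [(u.name, u.assigned) for u in users])
```

A greedy pass like this runs in time proportional to the number of categories times the number of users (after sorting), which is why such heuristics are attractive when the exact problem is NP-complete; how close it gets to the optimum depends on the instance and on the ordering rule.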
