Semantic Huffman Coding using Synonymous Mapping

Jin Xu, Kai Niu, Zijian Liang, Ping Zhang

TL;DR

This work proposes semantic Huffman coding, which uses synonymous mapping and synonymous sets to build Huffman code trees from semantic rather than purely syntactic probabilities. By merging the leaves that belong to the same synonymous set, the method reduces the average code length toward the semantic entropy $H_s(\tilde{\mathcal{U}})$; it is grounded in the semantic source coding theorem and the semantic Kraft inequality. The approach yields shorter codes than classical Huffman coding while preserving meaning under the semantically lossless condition, as demonstrated in text-based experiments on Shannon's foundational paper. The results indicate practical compression gains and suggest a way to compress below the classical Shannon entropy limit by incorporating semantic structure into source coding.
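As a reading aid, the quantity $H_s(\tilde{\mathcal{U}})$ can be written out as follows. This is a sketch of the definition as it is usually stated in the semantic information theory the paper builds on; the summary itself does not reproduce it, so the index notation ($u_i$ for syntactic symbols, $\mathcal{U}_{i_s}$ for synonymous sets) is an assumption.

```latex
% Probability of a synonymous set: the sum over its member symbols (assumed notation)
p(\mathcal{U}_{i_s}) = \sum_{u_i \in \mathcal{U}_{i_s}} p(u_i)

% Semantic entropy: the entropy of the distribution over synonymous sets
H_s(\tilde{\mathcal{U}}) = -\sum_{i_s} p(\mathcal{U}_{i_s}) \log_2 p(\mathcal{U}_{i_s})

% Merging probabilities cannot increase entropy, so
H_s(\tilde{\mathcal{U}}) \le H(U) = -\sum_{i} p(u_i) \log_2 p(u_i)
```

Equality holds when every synonymous set contains exactly one symbol, in which case semantic Huffman coding reduces to the classical construction.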

Abstract

Semantic communication stands out as a highly promising avenue for future developments in communications. Theoretically, source compression coding based on semantics can achieve rates lower than the Shannon entropy. This paper introduces semantic Huffman coding, built upon semantic information theory. By incorporating synonymous mapping and synonymous sets, semantic Huffman coding can achieve shorter average code lengths. Furthermore, we demonstrate that semantic Huffman coding can, in theory, approach the semantic entropy. Experimental results indicate that, under the semantically lossless condition, semantic Huffman coding exhibits clear advantages in compression efficiency over classical Huffman coding.
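The leaf-merging idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy vocabulary, its probabilities, the synonymous grouping, and the helper names (`huffman_lengths`, `merge_synonyms`, `average_length`, `entropy`) are all hypothetical, chosen only to show how summing the probabilities of synonymous symbols before building the tree shortens the average code length.

```python
import heapq
import math
from itertools import count


def huffman_lengths(probs):
    """Return {symbol: codeword length} for a binary Huffman code."""
    if len(probs) == 1:
        return {s: 1 for s in probs}
    tiebreak = count()  # keeps heap entries comparable when probabilities tie
    heap = [(p, next(tiebreak), [s]) for s, p in probs.items()]
    heapq.heapify(heap)
    lengths = {s: 0 for s in probs}
    while len(heap) > 1:
        p1, _, group1 = heapq.heappop(heap)
        p2, _, group2 = heapq.heappop(heap)
        # Every symbol under the new internal node gets one bit deeper.
        for s in group1 + group2:
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, next(tiebreak), group1 + group2))
    return lengths


def merge_synonyms(probs, synonymous_sets):
    """Collapse each synonymous set into one leaf with the summed probability."""
    return {name: sum(probs[m] for m in members)
            for name, members in synonymous_sets.items()}


def average_length(probs, lengths):
    return sum(probs[s] * lengths[s] for s in probs)


def entropy(probs):
    return -sum(p * math.log2(p) for p in probs.values() if p > 0)


# Hypothetical toy source: four words, three of which are treated as synonyms.
syntactic = {"said": 0.5, "big": 0.25, "large": 0.125, "huge": 0.125}
synonymous_sets = {"SAID": ["said"], "BIG": ["big", "large", "huge"]}

semantic = merge_synonyms(syntactic, synonymous_sets)

syntactic_avg = average_length(syntactic, huffman_lengths(syntactic))
semantic_avg = average_length(semantic, huffman_lengths(semantic))

print(f"syntactic: {syntactic_avg} bits/symbol, entropy {entropy(syntactic)}")
print(f"semantic:  {semantic_avg} sebits/symbol, semantic entropy {entropy(semantic)}")
```

In this toy case the classical tree needs 1.75 bits per symbol, while the merged tree needs 1.0. The price is that a decoder can recover only the synonymous set, not the exact word, which is what "semantically lossless" permits; how a representative word is chosen from the set is left open here.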

Paper Structure

This paper contains 7 sections, 6 equations, 3 figures, 4 tables.

Figures (3)

  • Figure 1: Schematic diagram of the synonymous set and synonymous mapping.
  • Figure 2: System block diagram of semantic Huffman coding.
  • Figure 3: Comparison of the average code lengths of semantic Huffman trees and syntactic Huffman trees. "sebits" denotes semantic bits, the unit of semantic source coding introduced in niu2024Mathematical.