Breaking through the classical Shannon entropy limit: A new frontier through logical semantics

Luis A. Lastras, Barry M. Trager, Jonathan Lenchner, Wojciech Szpankowski, Chai Wah Wu, Mark S. Squillante, Alexander Gray

TL;DR

The authors argue that any proposal for a semantic information theory should include deductive reasoning, which magnifies the value of transmitted bits and helps unlock the inherent potential of semantics.

Abstract

Information theory has provided foundations for the theories of several application areas critical for modern society, including communications, computer storage, and AI. A key aspect of Shannon's 1948 theory is a sharp lower bound on the number of bits needed to encode and communicate a string of symbols. When he introduced the theory, Shannon famously excluded any notion of semantics behind the symbols being communicated. This semantics-free notion went on to have massive impact on communication and computing technologies, even as multiple proposals for reintroducing semantics in a theory of information were being made, notably one where Carnap and Bar-Hillel used logic and reasoning to capture semantics. In this paper we present, for the first time, a Shannon-style analysis of a communication system equipped with a deductive reasoning capability, implemented using logical inference. We use some of the most important techniques developed in information theory to demonstrate significant and sometimes surprising gains in communication efficiency availed to us through such capability, demonstrated also through practical codes. We thus argue that proposals for a semantic information theory should include the power of deductive reasoning to magnify the value of transmitted bits as we strive to fully unlock the inherent potential of semantics.

Paper Structure

This paper contains 5 sections, 1 theorem, 1 equation, and 3 figures.

Key Result

Theorem 1

For any distribution over $( \mathtt{S_m}, \mathtt{Q_m}, \mathtt{R_m} )$ meeting the entailment conditions $\mathtt{S_m} \vdash \mathtt{Q_m}$ and $\mathtt{Q_m} \vdash \mathtt{R_m}$, if the corresponding kernels have normalized sizes $p_s,p_q,p_r$, respectively, then for the case Alice knows $\math

Figures (3)

  • Figure 1: Communication problem setup and algorithm innovations. (a) General communication diagram which allows for Bob to possess facts that Alice may or may not know, and which allows for the query $\mathtt{Q_m}$ that Alice is helping Bob prove to be anything from her full knowledge $\mathtt{S_m}$ to a more targeted logic statement. (b)-(e) Kernel diagrams showing equivalence, negation, conjunction, and entailment. (f) Theoretically optimum solution architectures for the cases where Alice knows and does not know Bob's knowledge, respectively. (g) Probabilistic model for kernels where the parameters $p_s, p_q, p_r$ are average kernel sizes, illustrated in the case where Alice's knowledge implies Bob's. (h) Example of the "no need to know" decoding mechanism, in which Alice maps her kernel to one of 4 possible hash bins; Bob is able to reconstruct ${\kappa( \mathtt{S_m})}$ without Alice knowing $\mathtt{R_m}$ ($p_r < 1$) by rejecting all kernels that do not entail $\mathtt{R_m}$ or that do not match the received bin index.
  • Figure 2: Experimental results for two scenarios. (a) Results for the case Alice knows $\mathtt{R_m}$, $p_r=0.5, p_s=0.075$ and $0.125 \leq p_q \leq 0.425$ demonstrating the significant gains in communication cost using practical semantic communication codes compared to purely classical approaches. (b) Results for the setting $p_r < 0.5$ and $p_s=p_q=0.075$ in the case Alice doesn't know $\mathtt{R_m}$, also comparing a classical approach with one leveraging logical semantics, as a multiple of the Shannon bound.
  • Figure 3: (a) Less is More. In blue, the ultimate communication limit $\Lambda$ for the case $p_r=1$, as the query ranges from trivial ($p_q=1$) to coinciding with the sender's information ($p_q=p_s=0.15$). $\Lambda$ is cheaper (Less...) than the two obvious strategies, yet the kernel size received by Bob is smaller than that of the query, showing Bob can prove even more things (is More...) than required. A similar picture will hold for any $p_r$. (b)-(c) Kernel diagrams that model Bob's state of ignorance versus misinformation. (d) Relative cost of misinformation to ignorance. We compare the ultimate limits for (b) (ignorance) and (c) (misinformation); as Bob becomes more opinionated ($p_r \rightarrow p_s$, $p_s=0.1$), the ratio goes to infinity.
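
The binning idea in the Figure 1(h) caption can be illustrated with a toy sketch. This is not the authors' code, and several things here are assumptions made only for illustration: a kernel is modeled as the set of possible worlds consistent with a statement, entailment is modeled as set inclusion, the "hash" is a simple sum modulo the number of bins, and Bob is assumed to know the size of Alice's kernel. The names `entails`, `bin_index`, and `bob_candidates` are hypothetical.

```python
from itertools import combinations

# Toy universe: 8 possible worlds (e.g. truth assignments of 3 variables).
WORLDS = frozenset(range(8))
NUM_BINS = 4  # matches the 4 hash bins mentioned in the Figure 1(h) caption


def entails(kernel_a, kernel_b):
    """a |- b, modeled (assumption) as kernel(a) being a subset of kernel(b)."""
    return kernel_a <= kernel_b


def bin_index(kernel):
    """Toy stand-in for a random hash: sum of world labels modulo NUM_BINS."""
    return sum(kernel) % NUM_BINS


def bob_candidates(received_bin, kernel_r, size):
    """Kernels of the given size that entail R and fall in the received bin."""
    return [
        frozenset(c)
        for c in combinations(sorted(WORLDS), size)
        if entails(frozenset(c), kernel_r)
        and bin_index(frozenset(c)) == received_bin
    ]


kernel_s = frozenset({1, 2})     # Alice's knowledge S
kernel_r = frozenset({0, 1, 2})  # Bob's knowledge R; S |- R holds here
sent_bin = bin_index(kernel_s)   # the only thing Alice transmits (2 bits)

# Bob filters by his own knowledge R; Alice never needed to know R.
with_r = bob_candidates(sent_bin, kernel_r, len(kernel_s))
# For comparison: a receiver with no side information (R trivial).
without_r = bob_candidates(sent_bin, WORLDS, len(kernel_s))

print(len(without_r), "candidates without R;", len(with_r), "with R")
```

In this toy instance the bin index alone leaves several candidate kernels, while Bob's own knowledge narrows them to a single one, even though Alice computed the bin without reference to $\mathtt{R_m}$. A real scheme would use a proper random hash and a bin count chosen from the kernel-size parameters; the modular sum is used here only so the example is easy to trace by hand.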

Theorems & Definitions (1)

  • Theorem 1