On The Theory of Semantic Information and Communication for Logical Inference

Ahmet Faruk Saz; Siheng Xiong; Faramarz Fekri

TL;DR

This work tackles the lack of a formal theory for semantic information in logical inference by adopting First-Order Logic as a universal world model and separating the physical cost of symbol transmission from the semantic content it conveys. It introduces a content-information measure within a two-level framework (Level 1: $H_{\text{phys}}$, Level 2: $H_{\text{cont}}$) calibrated by language granularity $G$, and develops inductive posterior updates over semantic constituents using evidential sub-languages. A semantic rate-distortion theory is formulated, with the evidential sub-languages keeping inference tractable, and is validated via semantic compression experiments and improved deductive performance in the LogiCity driving simulator. The results reveal a clear trade-off between transmission rate and preserved semantic content, offering an interpretable, mathematically grounded alternative to black-box semantic communication for inference tasks.

Abstract

First-Order Logic (FOL), also called first-order predicate calculus, is a formal language that provides a framework to comprehensively represent a world and its present state, including all of its entities, attributes, and complex interrelations, irrespective of their physical modality (e.g., text, image, or sensor data). Grounded in this universal representation, this paper develops a mathematical theory for semantic information and communication tailored to tasks involving logical reasoning and inference. For semantic communication, our framework distinguishes between two fundamental components: the physical cost of transmitting symbols of the FOL language and the logical content those symbols represent. A calibrated measure for semantic content is proposed, which allows for the consistent comparison of information value across different logical systems. This measure quantifies the degree to which a message reduces uncertainty about the true state of the world. Building on this measure, semantic entropy, conditional information, and mutual information metrics are defined. These metrics are then used to formulate optimizable objectives for semantic communication, designed to preserve the information most relevant to the logical reasoning task at the receiver while adhering to a transmission budget. The framework's operational value is demonstrated through experiments in two settings: semantic compression, where the proposed objectives are used to manage the trade-off between transmission cost and the preservation of logical content, and deductive inference, where increasing world-state awareness improves deduction performance.
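The idea that a message carries content to the extent that it rules out possible states of the world can be illustrated with the classical Carnap and Bar-Hillel measure, cont(s) = 1 - m(s), which the paper lists as Definition 1. The sketch below is not the paper's calibrated measure, whose formula is not reproduced here. It assumes a hypothetical toy language with one predicate (`Stop`) over two individuals and a uniform prior over state descriptions; all names are illustrative.

```python
from itertools import product

# Hypothetical toy language: one unary predicate over two named individuals.
ATOMS = ["Stop(car1)", "Stop(car2)"]

# A state description (possible world) assigns a truth value to every atom.
WORLDS = [dict(zip(ATOMS, values))
          for values in product([False, True], repeat=len(ATOMS))]


def logical_probability(statement):
    """m(s): share of worlds where the statement holds (uniform prior is an assumption)."""
    return sum(1 for world in WORLDS if statement(world)) / len(WORLDS)


def content(statement):
    """Classical cont(s) = 1 - m(s): share of worlds the statement excludes."""
    return 1.0 - logical_probability(statement)


# Three example statements, each a predicate over a world.
weak = lambda w: w["Stop(car1)"] or w["Stop(car2)"]
strong = lambda w: w["Stop(car1)"] and w["Stop(car2)"]
tautology = lambda w: True

print(content(weak))       # 0.25
print(content(strong))     # 0.75
print(content(tautology))  # 0.0
```

The conjunction excludes three of the four worlds and so carries more content than the disjunction, while a tautology excludes nothing and carries none.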

Paper Structure (16 sections, 2 theorems, 18 equations, 1 figure, 3 tables)

Introduction
Related Works
First-Order Language for Semantic Structures
Inductive Probability over Semantic Structures
Derivation of the Inductive Characteristic Function
Computing the Inductive Posterior over Constituents
A Quantitative Framework for Semantic Information
The Classical Foundation and Its Limits
A Bifurcated Theory of Semantic Information
Level 1: Physical Entropy of the Signifier
Level 2: Calibrated Content of the Signified
Axiomatic Foundation of the Content Measure
Semantic Compression Objectives
Experimental Validation
Experimental Setup and Results
...and 1 more section

Key Result

Proposition 1

Given a logical partition $\mathcal{M}=\{m_i\}$ of a universe with granularity $G$, the average content-information of a source given evidence $e$ is:

Figures (1)

Figure 1: Content-Semantic Source Coding

Theorems & Definitions (6)

Definition 1: Carnap's Content Measure
Definition 2: Physical Semantic Entropy
Definition 3: Language Granularity
Definition 4: Calibrated Content-Information
Proposition 1: Average Semantic Content-Information
Theorem 1: Semantic Source Coding Objectives
