On The Theory of Semantic Information and Communication for Logical Inference
Ahmet Faruk Saz, Siheng Xiong, Faramarz Fekri
TL;DR
This work addresses the lack of a formal theory of semantic information for logical inference by adopting First-Order Logic as a universal world model and separating the physical cost of transmitting symbols from the semantic content those symbols convey. It introduces a content-information measure within a two-level framework (Level 1: $H_{\text{phys}}$, Level 2: $H_{\text{cont}}$), calibrated by language granularity $G$, and develops inductive posterior updates over semantic constituents using evidential sub-languages. A semantic rate-distortion theory is formulated, with tractable inference achieved through these sub-languages, and the framework is validated through semantic compression experiments and improved deductive performance in the LogiCity driving simulator. The results reveal a clear trade-off between transmission rate and preserved semantic content, offering an interpretable, mathematically grounded alternative to black-box semantic communication for inference tasks.
Abstract
First-Order Logic (FOL), also called first-order predicate calculus, is a formal language that provides a framework for comprehensively representing a world and its present state, including all of its entities, attributes, and complex interrelations, irrespective of their physical modality (e.g., text, image, or sensor data). Grounded in this universal representation, this paper develops a mathematical theory of semantic information and communication tailored to tasks involving logical reasoning and inference. For semantic communication, our framework distinguishes between two fundamental components: the physical cost of transmitting symbols of the FOL language and the logical content those symbols represent. A calibrated measure of semantic content is proposed, which allows information value to be compared consistently across different logical systems. This measure quantifies the degree to which a message reduces uncertainty about the true state of the world. Building on this measure, semantic entropy, conditional information, and mutual information metrics are defined. These metrics are then used to formulate optimizable objectives for semantic communication, designed to preserve the information most relevant to the logical reasoning task at the receiver while adhering to a transmission budget. The framework's operational value is demonstrated through experiments in semantic compression, where the proposed objectives are used to manage the trade-off between transmission cost and the preservation of logical content, and in deductive inference, where increasing world-state awareness improves deduction performance.
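
To make concrete the idea that a message's content is the degree to which it reduces uncertainty about the true state of the world, the following is a minimal sketch under stated assumptions. It is not the calibrated measure proposed in the paper. It substitutes the classical Bar-Hillel/Carnap-style quantity, the negative log of the fraction of possible worlds consistent with the message, over a toy propositional world with a uniform prior. The atom names, the function names `all_worlds` and `content_bits`, and the uniform prior are all illustrative assumptions.

```python
from itertools import product
from math import log2

# Hypothetical toy world: three ground atoms (names are illustrative only).
ATOMS = ("car_at_intersection", "pedestrian_nearby", "light_is_red")


def all_worlds(atoms=ATOMS):
    """Enumerate every truth assignment (possible world state) over the atoms."""
    return [dict(zip(atoms, values))
            for values in product([False, True], repeat=len(atoms))]


def content_bits(message, worlds):
    """Bits of uncertainty removed by `message` under a uniform prior (assumption).

    `message` is a predicate mapping a world to True/False. The value returned
    is log2(#worlds) - log2(#worlds consistent with the message), a classical
    stand-in used here for illustration, not the paper's calibrated measure.
    """
    consistent = [w for w in worlds if message(w)]
    if not consistent:
        raise ValueError("message is contradictory: no world satisfies it")
    return log2(len(worlds)) - log2(len(consistent))


worlds = all_worlds()

# A weak message (disjunction) rules out few worlds; a strong one (conjunction)
# rules out many; a tautology rules out none.
weak = lambda w: w["light_is_red"] or w["pedestrian_nearby"]
strong = lambda w: w["light_is_red"] and w["pedestrian_nearby"]
tautology = lambda w: True

weak_bits = content_bits(weak, worlds)
strong_bits = content_bits(strong, worlds)
tautology_bits = content_bits(tautology, worlds)
```

In this toy setting the conjunction carries more content than the disjunction because it excludes more candidate world states, and the tautology carries none. Both messages could be encoded with symbol strings of similar length, which illustrates why the physical cost of transmission and the logical content are treated as separate quantities.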
