Neurosymbolic Deep Learning Semantics
Artur d'Avila Garcez, Simon Odense
TL;DR
The paper argues that semantics are essential for AI-driven science and proposes a neurosymbolic semantic-encoding framework that links neural networks to logical interpretations through three ingredients: an encoding map $i$, a set of stable states $X_{inf}$, and an aggregation $Agg$. It surveys a spectrum of encoding techniques (from strong to soft to hard, covering propositional, first-order, fuzzy, probabilistic, and modal logics) and formalizes a general framework that unifies these approaches. Key contributions include a rigorous definition of semantic encoding, Wordstar-inspired limits on encodings, fidelity metrics for assessing approximate encodings, and a learning-theoretic perspective showing how background knowledge can affect generalization under principled assumptions. The work lays a foundation for a theory of semantic encoding that reconciles learning and reasoning. It aims to guide the design of encodings that preserve data structure and improve generalization in deep learning systems, with potential connections to modal logic and program synthesis in neurosymbolic AI.
Abstract
Artificial Intelligence (AI) is a powerful new language of science, as evidenced by recent Nobel Prizes in chemistry and physics that recognized contributions to AI applied to those areas. Yet, this new language lacks semantics, which makes AI's scientific discoveries unsatisfactory at best. With the purpose of uncovering new facts but also improving our understanding of the world, AI-based science requires formalization through a framework capable of translating insight into comprehensible scientific knowledge. In this paper, we argue that logic offers an adequate framework. In particular, we use logic in a neurosymbolic framework to offer a much-needed semantics for deep learning, the neural network-based technology of current AI. Deep learning and neurosymbolic AI lack a general set of conditions to ensure that desirable properties are satisfied. Instead, there is a plethora of encoding and knowledge extraction approaches designed for particular cases. To rectify this, we introduced a framework for semantic encoding, making explicit the mapping between neural networks and logic, and characterizing the common ingredients of the various existing approaches. In this paper, we succinctly describe and exemplify how logical semantics and neural networks are linked through this framework, review some of the most prominent approaches and techniques developed for neural encoding and knowledge extraction, provide a formal definition of our framework, and discuss some of the difficulties of identifying a semantic encoding in practice in light of analogous problems in the philosophy of mind.
