Reasoning in Neurosymbolic AI

Son Tran, Edjard Mota, Artur d'Avila Garcez

TL;DR

This work presents Logical Boltzmann Machines (LBMs), a principled neurosymbolic framework that translates propositional logic into Restricted Boltzmann Machines (RBMs) to enable formal reasoning via energy minimization. It provides a theoretical equivalence between logical formulas and RBM energies, and develops practical methods for reasoning through sampling and free-energy optimization, as well as translating CNF forms for SAT and MaxSAT tasks. The paper reports empirical results on reasoning accuracy, learning-from-knowledge benchmarks, and MaxSAT performance, and demonstrates how LBMs can act as verifiable modules atop larger neural systems to improve safety, fairness, and data efficiency. Overall, LBMs offer a concrete pathway to integrate principled logic with deep networks, enabling reliable, explainable, and adaptable AI systems with broader implications for accountability in AI deployment.

Abstract

Knowledge representation and reasoning in neural networks is a long-standing endeavor that has attracted much attention recently. The principled integration of reasoning and learning in neural networks is a main objective of the area of neurosymbolic Artificial Intelligence (AI). In this chapter, a simple energy-based neurosymbolic AI system is described that can represent and reason formally about any propositional logic formula. This creates a powerful combination of learning from data and knowledge with logical reasoning. We start by positioning neurosymbolic AI in the context of the current AI landscape, which is unsurprisingly dominated by Large Language Models (LLMs). We identify important challenges of data efficiency, fairness and safety of LLMs that might be addressed by neurosymbolic systems with formal reasoning capabilities. We then discuss the representation of logic by the specific energy-based system, including illustrative examples and an empirical evaluation of the correspondence between logical reasoning and energy minimization using Restricted Boltzmann Machines (RBMs). Learning from data and knowledge is also evaluated empirically and compared with a symbolic, a neural and a neurosymbolic system. The results, reported in this chapter in an accessible way, are expected to reignite research on the use of neural networks as massively parallel models for logical reasoning and to promote the principled integration of reasoning and learning in deep networks. We conclude the chapter by discussing the importance of positioning neurosymbolic AI within a broader framework of formal reasoning and accountability in AI, and the challenges that neurosymbolic AI faces in tackling the various known reliability problems of deep learning.
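The correspondence between satisfying a formula and minimizing an energy can be illustrated on a small case. The following is a minimal sketch, not the chapter's implementation: it assumes the clause-wise energy form from the published LBM construction, $E(\mathbf{x}, \mathbf{h}) = -\sum_j h_j \big( \sum_{t \in \mathcal{S}_{T_j}} x_t - \sum_{k \in \mathcal{S}_{K_j}} x_k - |\mathcal{S}_{T_j}| + \epsilon \big)$ with $0 < \epsilon < 1$, which this excerpt does not reproduce in full. The names `EPS`, `CLAUSES`, `formula` and `min_energy` are hypothetical. The formula is the XOR example $(\mathrm{x} \oplus \mathrm{y}) \leftrightarrow \mathrm{z}$, written as a full DNF in which at most one conjunctive clause can hold for any assignment.

```python
from itertools import product

EPS = 0.5  # assumed slack value; any number strictly between 0 and 1 works

# One entry per conjunctive clause: (indices of positive literals,
# indices of negative literals) over the variables (x, y, z).
CLAUSES = [
    ((), (0, 1, 2)),   # not x and not y and not z
    ((1, 2), (0,)),    # not x and y and z
    ((0, 2), (1,)),    # x and not y and z
    ((0, 1), (2,)),    # x and y and not z
]


def formula(x, y, z):
    """Truth value of (x XOR y) <-> z for 0/1 inputs."""
    return (x != y) == bool(z)


def min_energy(assignment):
    """Energy of an assignment after minimizing over the hidden units.

    Each hidden unit h_j is switched on only when doing so lowers the
    energy, i.e. when its clause activation is positive.
    """
    total = 0.0
    for pos, neg in CLAUSES:
        activation = (
            sum(assignment[t] for t in pos)
            - sum(assignment[k] for k in neg)
            - len(pos)
            + EPS
        )
        total -= max(activation, 0.0)
    return total


if __name__ == "__main__":
    for a in product((0, 1), repeat=3):
        print(a, formula(*a), min_energy(a))
```

Under these assumptions, every satisfying assignment reaches the minimum energy `-EPS` and every other assignment has energy 0, so searching for low-energy states is the same as searching for models of the formula.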

Paper Structure

This paper contains 32 sections, 4 theorems, 33 equations, 13 figures, 3 tables.

Key Result

Lemma 1

Let $\mathcal{S}_{T_j}$ denote the set of indices of the positive literals $\mathrm{x} _t$ in a conjunctive clause $j$. Let $\mathcal{S}_{K_j}$ denote the set of indices of the negative literals $\mathrm{x} _k$ in $j$. Any SDNF $\varphi \equiv \bigvee _j ( \bigwedge _{t} \mathrm{x} _t \wedge \bigwed

Figures (13)

  • Figure 1: An initial Sudoku board and two branches generated by placing a 3 at position 3 of blocks 1 and 3, respectively, and corresponding final states satisfying the constraints of the game.
  • Figure 2: Free energy term $-\log(1+e^{cx})$ for different confidence values $c$.
  • Figure 3: Linear correlation between satisfiability of a CNF and minimization of the free energy function for various confidence values $c$. Source: Tran_Garcez_2023.
  • Figure 4: An RBM equivalent to the XOR formula $( \mathrm{x} \oplus \mathrm{y} ) \leftrightarrow \mathrm{z}$.
  • Figure 5: Percentage coverage as a measure of completeness as sampling progresses in the RBM. 100% coverage is achieved for the class of formulae with different values for M and N averaged over 100 runs. The number of samples needed to achieve $100\%$ coverage is much lower than the number of possible assignments ($2^{M+N}$). For example, when M=20, N=10, all satisfying assignments are found after approximately $7.5 \times 10^6$ samples are provided as input to the RBM, whereas the number of possible assignments is approximately 1 billion, a ratio of sample size to the search space of $0.75\%$. The ratio for M=30, N=10 is even lower at $0.37\%$. Source: Tran_Garcez_2023.
  • ...and 8 more figures
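The free energy term in the Figure 2 caption, $-\log(1+e^{cx})$, can be evaluated directly. The sketch below is illustrative only: the function name `free_energy_term` and the sample values of `x` and `c` are assumptions, not taken from the chapter. It uses a numerically stable rewriting of the same expression so that large values of $cx$ do not overflow.

```python
import math


def free_energy_term(x, c):
    """Return -log(1 + exp(c * x)), computed in an overflow-safe way.

    Uses the identity log(1 + e^z) = max(z, 0) + log(1 + e^(-|z|)).
    """
    z = c * x
    return -(max(z, 0.0) + math.log1p(math.exp(-abs(z))))


if __name__ == "__main__":
    # Hypothetical sample points, chosen only to show the shape of the curve.
    for c in (1.0, 5.0, 10.0):
        values = [round(free_energy_term(x, c), 4) for x in (-1.0, 0.0, 1.0)]
        print(f"c={c}: {values}")
```

As $c$ grows, the term approaches $0$ for $x < 0$ and $-cx$ for $x > 0$, so a higher confidence value sharpens the gap between positive and negative inputs.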

Theorems & Definitions (12)

  • Example 1
  • Definition 1
  • Definition 2
  • Lemma 1
  • Theorem 1
  • Example 2
  • Lemma 2
  • Example 3
  • Example 4
  • Lemma 3
  • ...and 2 more