Deep Symbolic Learning: Discovering Symbols and Rules from Perceptions

Alessandro Daniele; Tommaso Campari; Sagar Malhotra; Luciano Serafini

TL;DR

Deep Symbolic Learning (DSL) is a Neuro-Symbolic (NeSy) framework that jointly learns perception functions and internal symbolic representations, addressing the symbol grounding problem within a fully differentiable pipeline. It uses policy-based discrete symbol selection to create interpretable symbols and learn symbolic rules from data, with supervision only on the composed NeSy-function. DSL supports both direct and recurrent NeSy-functions and learns the symbolic function g via a learnable weight tensor, which lets gradients flow through the discrete choices. Empirical results on MNIST-based tasks (sum, parity, and multi-digit sum) show competitive accuracy, effective symbol grounding, generalization to longer sequences, transfer to related tasks, and inference that scales to very long inputs. The approach is a unified, end-to-end method for discovering and grounding symbols and rules from perception without strong prior assumptions about the symbolic structure.

Abstract

Neuro-Symbolic (NeSy) integration combines symbolic reasoning with Neural Networks (NNs) for tasks requiring perception and reasoning. Most NeSy systems rely on continuous relaxation of logical knowledge, and no discrete decisions are made within the model pipeline. Furthermore, these methods assume that the symbolic rules are given. In this paper, we propose Deep Symbolic Learning (DSL), a NeSy system that learns NeSy-functions, i.e., the composition of a (set of) perception functions which map continuous data to discrete symbols, and a symbolic function over the set of symbols. DSL learns simultaneously the perception and symbolic functions while being trained only on their composition (NeSy-function). The key novelty of DSL is that it can create internal (interpretable) symbolic representations and map them to perception inputs within a differentiable NN learning pipeline. The created symbols are automatically selected to generate symbolic functions that best explain the data. We provide experimental analysis to substantiate the efficacy of DSL in simultaneously learning perception and symbolic functions.
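To make the composition described in the abstract concrete, here is a minimal forward-pass sketch of a direct NeSy-function on a toy two-symbol sum task. It is an illustration under assumptions, not the authors' implementation: the function names (`greedy_policy`, `build_rule_table`, `nesy_forward`), the nested-list layout `W[s1][s2][out]`, and the hand-written perception scores are all hypothetical, and a plain argmax stands in for the policy. Training (how gradients reach the perception networks and the weight tensor) is not shown.

```python
# Hypothetical toy sketch of a direct NeSy-function forward pass.
# Assumption: a greedy (argmax) policy makes every discrete choice.

def greedy_policy(scores):
    """Return (index, score) of the highest-scoring entry, i.e. one discrete choice."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return best, scores[best]

def build_rule_table(W):
    """Turn weights W[s1][s2][out] into a lookup table G[s1][s2] = chosen output."""
    return [[greedy_policy(cell)[0] for cell in row] for row in W]

def nesy_forward(scores_1, scores_2, G):
    """Compose perception (scores -> symbol) with the symbolic function (table G)."""
    s1, _ = greedy_policy(scores_1)   # symbol chosen for the first input
    s2, _ = greedy_policy(scores_2)   # symbol chosen for the second input
    return G[s1][s2]

# Toy weights for 2 symbols and outputs 0..2; the largest weight sits at s1 + s2.
W = [[[1.0 if out == s1 + s2 else 0.0 for out in range(3)]
      for s2 in range(2)]
     for s1 in range(2)]

G = build_rule_table(W)                                 # [[0, 1], [1, 2]]
prediction = nesy_forward([0.1, 0.9], [0.8, 0.2], G)    # symbols (1, 0) -> output 1
print(G, prediction)
```

The point of the sketch is that only `prediction` would be compared with a label; the intermediate symbols `s1` and `s2` are never supervised directly.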

Paper Structure (24 sections, 17 equations, 5 figures, 2 tables)

Introduction
Related Works
Background
Notation.
Fuzzy Logic.
Problem Definition.
Method
Policy Functions.
DSL for Direct NeSy-functions.
Learning the Perception Functions.
DSL for Recurrent NeSy-functions.
Learning Symbolic Functions.
Gradient Analysis for the Greedy Policy.
Experiments
Evaluation.
...and 9 more sections

Figures (5)

Figure 1: Architecture of Deep Symbolic Learning for the Sum task. Red arrows represent the backward signal during learning.
Figure 2: Architecture of Deep Symbolic Learning for simple recurrent NeSy-functions.
Figure 3: Tensor ${\bf W}$ is used by the policy to generate tensor ${\bf G}$. This is done by applying the policy along the output dimension (the vertical axis in the image), selecting a single output element for each pair of symbols $(s_1, s_2) \in \mathcal{S}_1 \times \mathcal{S}_2$.
Figure 4: Confusion matrix for the MNIST digits: (left) before the permutation; (right) after permutation.
Figure 5: DSL learns symbolic representations for images, but these representations may not align with the human notion of digit symbols. To address this, a bijective mapping is applied to connect the DSL symbols to the digits as humans understand them. DSL performs summation correctly in its learned symbolic representation. E.g., in the confusion matrix (a), symbol $s_0$ corresponds to digit $4$, and symbol $s_{1}$ corresponds to digit $7$. The learned summation rule for $s_0 + s_1$ can be found in matrix $\mathbf{G}$ (c) in position $(0,1)$, i.e., $\mathbf{G}[0,1] = 11$, which is the correct value for the summation of $4$ and $7$. Evaluating the model's ability to learn symbolic rules becomes easier once the right permutation of the learned symbols is applied. As an example, the confusion matrix in (b) is obtained by permuting the symbols on the x-axis so that the matrix becomes diagonal. The same permutation is applied in (d) to both rows and columns, yielding a human-interpretable rule matrix.
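
The relabeling described in the Figure 5 caption can be sketched as follows. This is a toy illustration under assumptions, not the authors' evaluation code: the helper names (`symbol_to_digit`, `permute_rules`) are hypothetical, the confusion matrix is assumed to be laid out as `confusion[digit][symbol]` (true digits on rows, learned symbols on columns), and each symbol is mapped to the digit it co-occurs with most often.

```python
# Hypothetical sketch: align learned symbols with digits, then permute the rule matrix.

def symbol_to_digit(confusion):
    """Map each learned symbol to its most frequent true digit.

    Assumes confusion[digit][symbol] holds co-occurrence counts.
    """
    n = len(confusion)
    mapping = {s: max(range(n), key=lambda d: confusion[d][s]) for s in range(n)}
    if sorted(mapping.values()) != list(range(n)):
        raise ValueError("symbols do not map one-to-one onto digits")
    return mapping

def permute_rules(G, mapping):
    """Reorder rows and columns of G so that index d corresponds to digit d."""
    n = len(G)
    digit_to_symbol = {d: s for s, d in mapping.items()}
    return [[G[digit_to_symbol[a]][digit_to_symbol[b]] for b in range(n)]
            for a in range(n)]

# Toy case with 3 symbols: s0 -> digit 2, s1 -> digit 0, s2 -> digit 1.
confusion = [
    [0, 9, 1],    # true digit 0 is mostly assigned symbol s1
    [0, 1, 9],    # true digit 1 is mostly assigned symbol s2
    [10, 0, 0],   # true digit 2 is always assigned symbol s0
]
# Rule matrix in symbol space: G[s1][s2] is the sum of the digits behind the symbols.
G = [[4, 2, 3],
     [2, 0, 1],
     [3, 1, 2]]

mapping = symbol_to_digit(confusion)   # {0: 2, 1: 0, 2: 1}
aligned = permute_rules(G, mapping)    # [[0, 1, 2], [1, 2, 3], [2, 3, 4]]
print(mapping, aligned)
```

After the permutation, `aligned[a][b] == a + b`, which is the kind of human-readable addition table the caption describes. The column-wise argmax used here fails loudly when two symbols claim the same digit; an optimal assignment method would be a more robust choice in that case.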

Theorems & Definitions (7)

Definition 1: Direct NeSy-function
Example 1: Sum task
Definition 2: Simple Recurrent NeSy-function
Example 2: Visual Parity
Example 3: Multi-digit Sum
Example 4: Example 1 continued
Example 5: MNIST Multiop task
