AS2 -- Attention-Based Soft Answer Sets: An End-to-End Differentiable Neuro-Soft-Symbolic Reasoning Architecture

Wael AbdAlmageed

Abstract

Neuro-symbolic artificial intelligence (AI) systems typically couple a neural perception module to a discrete symbolic solver through a non-differentiable boundary, preventing constraint-satisfaction feedback from reaching the perception encoder during training. We introduce AS2 (Attention-Based Soft Answer Sets), a fully differentiable neuro-symbolic architecture that replaces the discrete solver with a soft, continuous approximation of the Answer Set Programming (ASP) immediate consequence operator $T_P$. AS2 maintains per-position probability distributions over a finite symbol domain throughout the forward pass and trains end-to-end by minimizing the fixed-point residual of a probabilistic lift of $T_P$, thereby differentiating through the constraint check without invoking an external solver at either training or inference time. The architecture is entirely free of conventional positional embeddings. Instead, it encodes problem structure through constraint-group membership embeddings that directly reflect the declarative ASP specification, making the model agnostic to arbitrary position indexing. On Visual Sudoku, AS2 achieves 99.89% cell accuracy and 100% constraint satisfaction (verified by Clingo) across 1,000 test boards, using a greedy constrained decoding procedure that requires no external solver. On MNIST Addition with $N \in \{2, 4, 8\}$ addends, AS2 achieves digit accuracy above 99.7% across all scales. These results demonstrate that a soft differentiable fixpoint operator, combined with constraint-aware attention and declarative constraint specification, can match or exceed pipeline and solver-based neuro-symbolic systems while maintaining full end-to-end differentiability.
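To make the idea of differentiating through a constraint check concrete, the following is a minimal sketch of a soft penalty over per-position probability distributions. It is not the paper's $T_P$ fixed-point residual, whose exact form is not given here. It assumes a simpler pairwise-collision relaxation: under independent per-cell distributions, it sums the probability that two cells in the same constraint group take the same digit. The names `sudoku_groups` and `soft_collision_penalty` are hypothetical.

```python
import numpy as np

def sudoku_groups():
    """Return 27 index arrays (9 rows, 9 columns, 9 boxes) over cells 0..80."""
    idx = np.arange(81).reshape(9, 9)
    rows = [idx[r, :] for r in range(9)]
    cols = [idx[:, c] for c in range(9)]
    boxes = [idx[r:r + 3, c:c + 3].ravel()
             for r in range(0, 9, 3) for c in range(0, 9, 3)]
    return rows + cols + boxes

def soft_collision_penalty(probs, groups):
    """Mean expected number of same-digit pairs per constraint group.

    probs: array of shape (81, 9); each row is a distribution over digits.
    Assumption: cells are treated as independent, so the probability that
    cells i and j both take digit d is probs[i, d] * probs[j, d].
    """
    total = 0.0
    for g in groups:
        p = probs[g]                      # (cells in group, digits)
        s = p.sum(axis=0)                 # expected count of each digit
        # sum over pairs i<j of p_i * p_j == ((sum p)^2 - sum p^2) / 2
        total += 0.5 * (s ** 2 - (p ** 2).sum(axis=0)).sum()
    return total / len(groups)

# A known valid board pattern, encoded one-hot, incurs zero penalty.
r, c = np.divmod(np.arange(81), 9)
valid_board = (3 * r + r // 3 + c) % 9
one_hot = np.eye(9)[valid_board]
uniform = np.full((81, 9), 1.0 / 9.0)
```

On a one-hot encoding of a valid board the penalty is 0, while the uniform distribution yields 4.0, so the quantity is smooth in `probs` and vanishes only when no group contains a repeated digit. A real implementation would compute this in an autodiff framework so that the gradient can reach the perception encoder.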

Paper Structure (35 sections, 6 equations, 2 figures, 3 tables)

Introduction
Related Work
Pipeline Neuro-Symbolic Systems
End-to-End Differentiable Neuro-Symbolic Systems
Perception-Based Constraint Solving
Positional Encodings in Neuro-Symbolic Transformers
Method
Problem Formulation
Probabilistic Lift of the Immediate-Consequence Operator
The $T_P$ Fixed-Point Loss
Architecture
Perception Module.
Evidence Clamping.
Constraint-Group Membership Embeddings (Positional-Embedding-Free Design).
Reasoning Module.
...and 20 more sections

Figures (2)

Figure 1: AS$^2$ architecture. Raw perceptual signals are encoded cell-by-cell by a shared-weight convolutional neural network (CNN) (§\ref{sec:architecture}). A concept bottleneck head produces per-position pre-reasoning logits; clue cells are evidence-clamped before the soft distributions are projected to $\mathbb{R}^{d_{\mathrm{model}}}$ and enriched by constraint-group membership embeddings (row, column, and box for Sudoku), replacing conventional positional encodings. A multi-layer Transformer encoder propagates information across all positions jointly. The output head produces post-reasoning logits supervised end-to-end by the differentiable $T_P$ fixed-point loss (Equation \ref{eq:tp_loss}), which enforces constraint satisfaction without calling an external solver.
Figure 2: Complete declarative constraint specification for Visual Sudoku in Answer Set Programming (ASP) syntax. The three integrity constraints enforce that every digit appears exactly once in each row, column, and $3{\times}3$ box, respectively. These rules are compiled automatically into the differentiable $T_P$ residual loss with no manual penalty engineering.
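
The constraint-group membership embeddings mentioned in the Figure 1 caption can be illustrated with a short sketch. It assumes one embedding table per constraint family (row, column, box) and assumes the three looked-up vectors are summed; the source does not state how they are combined, and the random tables stand in for learned weights. The names `group_ids` and `membership_embeddings` are hypothetical.

```python
import numpy as np

def group_ids(n=9, box=3):
    """Row, column, and box index of every cell in an n x n grid (row-major)."""
    r, c = np.divmod(np.arange(n * n), n)
    b = (r // box) * (n // box) + (c // box)
    return r, c, b

def membership_embeddings(d_model=16, n=9, box=3, seed=0):
    """Per-cell embedding built only from constraint-group membership.

    Assumption: the row, column, and box vectors are summed. No index of
    the cell's position in the sequence enters the computation.
    """
    rng = np.random.default_rng(seed)
    # One table per constraint family; random values stand in for learned weights.
    row_tab, col_tab, box_tab = (rng.normal(size=(n, d_model)) for _ in range(3))
    r, c, b = group_ids(n, box)
    return row_tab[r] + col_tab[c] + box_tab[b]

emb = membership_embeddings()
```

Because each cell's vector depends only on which groups it belongs to, reordering the cells in the input sequence simply reorders the rows of `emb`, which is one way to read the claim that the model is agnostic to arbitrary position indexing.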
