The Gradient of Algebraic Model Counting

Jaron Maene, Luc De Raedt

TL;DR

The paper introduces $\nabla \text{AMC}$, a generalization of gradients to algebraic model counting over arbitrary semirings, enabling learning in statistical-relational and neurosymbolic settings. It presents an optimized algebraic backpropagation algorithm that runs in time linear in the circuit size and exploits semiring properties such as cancellation and ordering to save memory, and it demonstrates substantial speed-ups over standard deep learning frameworks when computing gradients. By specializing to various semirings (e.g., Prob, Log, Viterbi, Grad, Bool with sampling), it shows how a wide range of learning signals, from probability and entropy to maximum-model objectives, can be obtained within a single framework. Theoretical results indicate that second-order optimization is generally not feasible in linear time on tractable circuits, but first-order algebraic gradients remain practical, with Kompyle delivering strong empirical performance. This work provides a unified, scalable pathway for differentiable reasoning in probabilistic logic and neurosymbolic AI, linking inference structure, algebra, and learning in a single formalism.

Abstract

Algebraic model counting unifies many inference tasks on logic formulas by exploiting semirings. Rather than focusing on inference, we consider learning, especially in statistical-relational and neurosymbolic AI, which combine logical, probabilistic and neural representations. Concretely, we show that the very same semiring perspective of algebraic model counting also applies to learning. This allows us to unify various learning algorithms by generalizing gradients and backpropagation to different semirings. Furthermore, we show how cancellation and ordering properties of a semiring can be exploited for more memory-efficient backpropagation. This allows us to obtain some interesting variations of state-of-the-art gradient-based optimisation methods for probabilistic logical models. We also discuss why algebraic model counting on tractable circuits does not lead to more efficient second-order optimization. Empirically, our algebraic backpropagation exhibits considerable speed-ups as compared to existing approaches.
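To make the semiring perspective concrete, the following is a minimal Python sketch, not the paper's implementation. It evaluates one small circuit for the formula $(a \land b) \lor (\lnot a \land \lnot b)$ under three semirings. The circuit encoding, the names `Semiring`, `evaluate`, `prob_label` and `grad_label`, and the weights 0.3 and 0.6 are all assumptions made for illustration. The gradient semiring shown here is the familiar dual-number construction (value, derivative), which may differ in detail from the paper's Grad semiring.

```python
import operator
from dataclasses import dataclass
from functools import reduce
from typing import Any, Callable


@dataclass(frozen=True)
class Semiring:
    """A commutative semiring: (zero, plus) and (one, times)."""
    zero: Any
    one: Any
    plus: Callable[[Any, Any], Any]
    times: Callable[[Any, Any], Any]


def evaluate(node, semiring, label):
    """Evaluate a circuit bottom-up: OR -> plus, AND -> times, literal -> label."""
    kind = node[0]
    if kind == "lit":
        _, var, positive = node
        return label(var, positive)
    values = [evaluate(child, semiring, label) for child in node[1]]
    if kind == "and":
        return reduce(semiring.times, values, semiring.one)
    if kind == "or":
        return reduce(semiring.plus, values, semiring.zero)
    raise ValueError(f"unknown node kind: {kind}")


# Hypothetical circuit for (a AND b) OR (NOT a AND NOT b).
# It is decomposable, deterministic and smooth, so sums range over distinct models.
CIRCUIT = ("or", [
    ("and", [("lit", "a", True), ("lit", "b", True)]),
    ("and", [("lit", "a", False), ("lit", "b", False)]),
])

# Illustrative weights (assumed): P(a) = 0.3, P(b) = 0.6.
P = {"a": 0.3, "b": 0.6}


def prob_label(var, positive):
    return P[var] if positive else 1.0 - P[var]


def grad_label(var, positive):
    """Pair each weight with its derivative with respect to P(a)."""
    if var == "a":
        return (prob_label(var, positive), 1.0 if positive else -1.0)
    return (prob_label(var, positive), 0.0)


PROB = Semiring(0.0, 1.0, operator.add, operator.mul)
VITERBI = Semiring(0.0, 1.0, max, operator.mul)
GRAD = Semiring(
    (0.0, 0.0),
    (1.0, 0.0),
    lambda x, y: (x[0] + y[0], x[1] + y[1]),                # sum rule
    lambda x, y: (x[0] * y[0], x[0] * y[1] + x[1] * y[0]),  # product rule
)

prob_value = evaluate(CIRCUIT, PROB, prob_label)        # probability of the formula
viterbi_value = evaluate(CIRCUIT, VITERBI, prob_label)  # weight of the best model
grad_value = evaluate(CIRCUIT, GRAD, grad_label)        # (probability, d/dP(a))
```

The same `evaluate` function serves all three tasks; only the semiring and the labeling change. That is the sense in which one circuit traversal can yield a probability, a maximum-model weight, or a derivative.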

Paper Structure

This paper contains 28 sections, 11 theorems, 36 equations, 2 tables, and 2 algorithms.

Key Result

Theorem 1

Every derivation $\delta$ is a linear combination of the elements in $\nabla \text{AMC}$. More formally, $\nabla \text{AMC}$ is a basis of the $\mathcal{F}_\mathcal{V}$-semimodule over $\mathcal{D}(\mathcal{F}_\mathcal{V})$.
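
For intuition, the classical analogue over a polynomial ring may help. This is a sketch under the assumption that $\nabla \text{AMC}$ plays the role of the partial derivatives $\partial_v$, one per variable $v \in \mathcal{V}$; the summary above does not spell this out. In that classical setting, a derivation is determined by its values on the variables:

$$\delta(f) = \sum_{v \in \mathcal{V}} \delta(v) \cdot \partial_v f, \qquad \text{e.g. for } f = xy: \quad \delta(xy) = \delta(x)\, y + x\, \delta(y).$$

The example is the Leibniz rule written as a linear combination of $\partial_x f = y$ and $\partial_y f = x$ with coefficients $\delta(x)$ and $\delta(y)$.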

Theorems & Definitions (32)

  • Definition 2.1: Commutative Monoid
  • Definition 2.2: Commutative Semiring
  • Definition 2.3
  • Definition 2.4: Algebraic Model Counting
  • Example 1
  • Definition 2.5: Boolean Circuit
  • Example 2
  • Definition 3.1
  • Example 3
  • Definition 3.2: Semiring Derivation
  • ...and 22 more