A Fast Convoluted Story: Scaling Probabilistic Inference for Integer Arithmetic

Lennert De Smet, Pedro Zuidberg Dos Martires

TL;DR

This work formulates linear arithmetic over integer-valued random variables as tensor manipulations that can be implemented straightforwardly in modern deep learning libraries. It shows that tensorising probabilistic linear integer arithmetic and leveraging the fast Fourier transform improves on the state of the art by several orders of magnitude in inference and learning times.

Abstract

As illustrated by the success of integer linear programming, linear integer arithmetic is a powerful tool for modelling combinatorial problems. Furthermore, the probabilistic extension of linear programming has been used to formulate problems in neurosymbolic AI. However, two key problems persist that prevent the adoption of neurosymbolic techniques beyond toy problems. First, probabilistic inference is inherently hard, #P-hard to be precise. Second, the discrete nature of integers renders the construction of meaningful gradients challenging, which is problematic for learning. In order to mitigate these issues, we formulate linear arithmetic over integer-valued random variables as tensor manipulations that can be implemented in a straightforward fashion using modern deep learning libraries. At the core of our formulation lies the observation that the addition of two integer-valued random variables can be performed by adapting the fast Fourier transform to probabilities in the log-domain. By relying on tensor operations we obtain a differentiable data structure, which unlocks, virtually for free, gradient-based learning. In our experimental validation we show that tensorising probabilistic linear integer arithmetic and leveraging the fast Fourier transform allows us to push the state of the art by several orders of magnitude in terms of inference and learning times.
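To make the central observation concrete, the following is a minimal sketch of adding two independent integer-valued random variables with an FFT-based convolution. It is not the authors' implementation: the abstract describes an FFT adapted to log-domain probabilities on tensors, whereas this sketch works in the ordinary probability domain with plain Python lists. The names `fft`, `add_pmfs`, and `add_pmfs_via_joint`, and the representation of a distribution as a lower bound plus a list of probabilities, are assumptions made for illustration.

```python
import cmath


def fft(values, inverse=False):
    """Recursive radix-2 FFT (unnormalised); len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def add_pmfs(lower1, probs1, lower2, probs2):
    """Distribution of X1 + X2 for independent X1 and X2 (hypothetical helper).

    Each input is a lower bound plus the probabilities of consecutive integers.
    """
    size = len(probs1) + len(probs2) - 1
    padded = 1
    while padded < size:  # zero-pad to a power of two for the radix-2 FFT
        padded *= 2
    f1 = fft(list(probs1) + [0.0] * (padded - len(probs1)))
    f2 = fft(list(probs2) + [0.0] * (padded - len(probs2)))
    product = [a * b for a, b in zip(f1, f2)]
    # The inverse transform is unnormalised, so divide by the padded length.
    # Tiny negative values caused by rounding are clamped to zero.
    conv = [max(v.real / padded, 0.0) for v in fft(product, inverse=True)]
    return lower1 + lower2, conv[:size]


def add_pmfs_via_joint(lower1, probs1, lower2, probs2):
    """Reference version: build every joint entry and sum along the diagonals."""
    out = [0.0] * (len(probs1) + len(probs2) - 1)
    for i, p in enumerate(probs1):
        for j, q in enumerate(probs2):
            out[i + j] += p * q
    return lower1 + lower2, out


if __name__ == "__main__":
    print(add_pmfs(0, [0.5, 0.5], 1, [0.2, 0.3, 0.5]))
    print(add_pmfs_via_joint(0, [0.5, 0.5], 1, [0.2, 0.3, 0.5]))
```

Both functions should agree up to floating-point rounding. The joint-based version performs a number of multiplications proportional to the product of the two domain sizes, while the FFT-based version scales as n log n in the padded length.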

Paper Structure

This paper contains 21 sections, 31 equations, 10 figures, 1 table, 3 algorithms.

Figures (10)

  • Figure 1: On the left and in the middle we have two histograms representing the probability distributions of the random variables $X_1$ and $X_2$, respectively. The grid on the right represents the joint probability of the two distributions, with more intense colors indicating events with higher probability. The distribution of the random variable $X=X_1+X_2$ can be obtained by summing up the diagonals of the grid as indicated in the figure. While this method of obtaining the distribution for $X$ is valid and used by state-of-the-art neurosymbolic techniques [huang2021scallop, manhaeve2018deepproblog], the explicit construction of the joint is unnecessary and hampers inference and learning times (cf. the experiments section).
  • Figure 1: Probabilistic Luhn algorithm written in Python using our plia library.
  • Figure 2: (Left) Adding a constant to a probabilistic integer simply means that we have to shift the corresponding histogram, shown here for $X'=X+1$. (Middle) For the negation $X'=-X$, the bins of the histogram reverse their order and the negation of the upper bound becomes the new lower bound. (Right) For multiplication by a constant, shown here for $X'=3X$, zero-probability bins are inserted between the original bins.
  • Figure 2: Computing the sum of two numbers by explicitly constructing the joint distributions of the resulting digit. PInt is the plia primitive to construct probabilistic integers.
  • Figure 3: (Left) We show the histogram transformation for the integer division $X'= X/3$. The probability mass of three subsequent bins is accumulated in the bins for which $x \bmod 3 = 0$ and $x/3\in \Omega(X)$. (Right) For the modulo $X'=X \bmod 3$, the only non-zero elements of $\Omega(X')$ are elements of the set $\{0,1, 2 \}$. The bins corresponding to these values then accumulate the probability masses of all other bins as indicated by the colors.
  • ...and 5 more figures
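The histogram transformations described in the captions of Figures 2 and 3 can be sketched as follows. This is an illustrative sketch only, not the paper's library: the function names and the representation of a probabilistic integer as a lower bound plus a list of bin probabilities are assumptions, the scaling constant is assumed to be a positive integer, and integer division is assumed to round towards negative infinity (Python's `//`).

```python
def shift(lower, probs, c):
    """X' = X + c: the histogram moves, the bins are unchanged."""
    return lower + c, list(probs)


def negate(lower, probs):
    """X' = -X: bins reverse; the negated upper bound is the new lower bound."""
    upper = lower + len(probs) - 1
    return -upper, list(reversed(probs))


def scale(lower, probs, c):
    """X' = c * X for a positive integer c: insert c - 1 empty bins between bins."""
    out = [0.0] * (c * (len(probs) - 1) + 1)
    for i, p in enumerate(probs):
        out[c * i] = p
    return c * lower, out


def floor_divide(lower, probs, d):
    """X' = X // d: each run of d consecutive values collapses into one bin."""
    upper = lower + len(probs) - 1
    new_lower = lower // d
    out = [0.0] * (upper // d - new_lower + 1)
    for i, p in enumerate(probs):
        out[(lower + i) // d - new_lower] += p
    return new_lower, out


def modulo(lower, probs, m):
    """X' = X mod m: all mass accumulates in the bins 0, ..., m - 1."""
    out = [0.0] * m
    for i, p in enumerate(probs):
        out[(lower + i) % m] += p
    return 0, out


if __name__ == "__main__":
    x = (0, [0.1, 0.2, 0.3, 0.4])  # X takes values 0..3
    print(shift(*x, 1))
    print(negate(*x))
    print(scale(*x, 3))
    print(floor_divide(*x, 3))
    print(modulo(*x, 3))
```

Each transformation only moves or accumulates probability mass, so the total mass of the histogram is preserved in every case.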

Theorems & Definitions (1)

  • Definition 2.1: Probabilistic linear arithmetic expression