Probabilistic Neural Circuits

Pedro Zuidberg Dos Martires

TL;DR

The paper addresses the trade-off between tractability and expressivity in probabilistic models by introducing probabilistic neural circuits (PNCs), a conditional probabilistic circuit formalism where neural networks parameterize select weights. PNCs are interpreted as deep mixtures of Bayesian networks, implemented as layered, trainable structures with neural sum layers that relax decomposability while preserving tractable queries. Empirically, PNCs improve density estimation on MNIST-family datasets and outperform several state-of-the-art models, though discriminative performance requires regularization. The work provides a principled middle ground between probabilistic circuits and neural nets, with potential for improved sampling, structure learning, and broader applications beyond image data.

Abstract

Probabilistic circuits (PCs) have gained prominence in recent years as a versatile framework for discussing probabilistic models that support tractable queries and are yet expressive enough to model complex probability distributions. Nevertheless, tractability comes at a cost: PCs are less expressive than neural networks. In this paper we introduce probabilistic neural circuits (PNCs), which strike a balance between PCs and neural nets in terms of tractability and expressive power. Theoretically, we show that PNCs can be interpreted as deep mixtures of Bayesian networks. Experimentally, we demonstrate that PNCs constitute powerful function approximators.

Paper Structure

This paper contains 15 sections, 8 theorems, 11 equations, 6 figures, 2 tables, 1 algorithm.

Introduction
Preliminaries
Conditional Probabilistic Circuits
Partially Ordered Random Variables
Deep Mixtures of Bayesian Networks
PNCs and Their Tractable Queries
Layered Probabilistic Neural Circuits
Structure for One-Dimensional Data
Implementation Using Convolutions
Related Work
Experimental Evaluation
How Do PNCs Fare Against PQCs and PCs?
How Do PNCs Fare Against State of the Art?
Can PNCs Perform Discriminative Learning?
Conclusions & Future Work

Key Result

Corollary 3.4

A conditional probabilistic circuit over an unordered set of random variables is a (non-conditional) probabilistic circuit.

Figures (6)

Figure 1: Layered probabilistic circuit following the construction of Shih et al. (2021). Data (modeled as random variables) is first fed into the leaf layer at the bottom. The output of the leaf layer is a mixture of distributions produced by the sum units. In the sum-product layer (Layer 2), mixtures of random variables are combined by taking pairwise products; these products are then mixed again using sum units. Finally, the root layer (at the top) gives the joint probability distribution. The red edges indicate functional dependencies that are absent from traditional probabilistic circuits but present in probabilistic neural circuits.
Figure 2: Right: Bayesian network. Left: partial order relations that hold.
Figure 3: A balanced partition tree of a probabilistic neural circuit with eight variables. The partition tree (in black) describes how the variables decompose (in terms of the scope function). The edges in red indicate functional (neural) dependencies between partitions.
Figure 4: Detailed graphical representation of neural dependencies in a PNC. The sum unit at the top outputs the weighted sum of the three product units at the bottom right. The weights of this sum, written here as $w_1, w_2, w_3$, are the outputs of a neural network that takes as input the values of the six product units at the bottom left, and they satisfy $\sum_{i=1}^3 w_i = 1$.
Figure 5: Graphical representation of half kernels used for neural sum layers in layered PNCs. On the left we see a kernel used for one-dimensional data while on the right we have a $3\times3$ kernel for two-dimensional data. The gray blocks indicate the learnable parameters of the half kernels, while a white square indicates a parameter fixed to zero. Effectively, the convolutional layer is blind with regard to the inputs for these zero elements of the kernel.
...and 1 more figure
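
The neural sum unit described in the Figure 4 caption can be illustrated with a small sketch. This is not the paper's implementation. It assumes the "neural network" is a single linear layer followed by a softmax, which is one simple way to obtain non-negative weights that sum to one. The function names, the weight matrix `W`, the bias `b`, and all numeric values are hypothetical.

```python
import math


def softmax(scores):
    """Map real-valued scores to positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def neural_sum_unit(child_values, context_values, weight_matrix, bias):
    """Mix child_values with weights computed from context_values.

    Assumption for illustration: the weight-producing network is one
    linear layer plus a softmax, not the architecture used in the paper.
    """
    # One score per child: a linear function of the context values.
    scores = [
        sum(w * c for w, c in zip(row, context_values)) + b_i
        for row, b_i in zip(weight_matrix, bias)
    ]
    weights = softmax(scores)
    # The sum unit outputs the weighted sum of its children.
    output = sum(w * v for w, v in zip(weights, child_values))
    return output, weights


# Hypothetical values: three product units to mix (bottom right in the
# figure) and six product units feeding the network (bottom left).
children = [0.2, 0.5, 0.1]
context = [0.3, 0.1, 0.4, 0.2, 0.6, 0.5]
W = [
    [0.5, -0.2, 0.1, 0.0, 0.3, -0.1],
    [-0.3, 0.4, 0.2, 0.1, -0.2, 0.0],
    [0.1, 0.1, -0.4, 0.2, 0.0, 0.3],
]
b = [0.0, 0.1, -0.1]

value, mix_weights = neural_sum_unit(children, context, W, b)
print(value, mix_weights)
```

Because the mixing weights depend on the values of other units rather than being fixed parameters, the same sum unit can mix its children differently for different inputs. A standard sum unit corresponds to the special case where the weights are constants.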

Theorems & Definitions (21)

Definition 2.1: Probabilistic Circuit
Definition 2.2: Scope
Definition 2.3: Smoothness
Definition 2.4: Decomposability
Definition 2.5: Valid Probabilistic Circuit
Example 3.1: Bayesian Network
Definition 3.2: Conditional Probabilistic Circuit (CPC)
Definition 3.3: Scope (CPC)
Corollary 3.4
Definition 3.5: Conditional Smoothness
...and 11 more
