On the Relationship Between Monotone and Squared Probabilistic Circuits

Benjie Wang, Guy Van den Broeck

TL;DR

The paper investigates two tractable probabilistic circuit paradigms—monotone PCs with non-negative weights and squared PCs with potentially negative weights whose square defines a density—and proves that neither dominates the other in expressive efficiency. It then introduces Inception PCs (IncPCs), a unified model that harnesses complex parameters and a latent-variable interpretation to realize deep sums-of-square-of-sums, and provides a tractable materialization of IncPCs into standard PCs for inference. A tensorized IncPC architecture is proposed for scalable training on GPUs, and the authors demonstrate that IncPCs outperform both monotone and squared PCs on tabular and image benchmarks, while preserving tractable marginal queries. The work thus offers a theoretically and practically advantageous framework for density modeling that reconciles prior approaches and expands expressive capacity for large-scale data.

Abstract

Probabilistic circuits are a unifying representation of functions as computation graphs of weighted sums and products. Their primary application is in probabilistic modeling, where circuits with non-negative weights (monotone circuits) can be used to represent and learn density/mass functions, with tractable marginal inference. Recently, it was proposed to instead represent densities as the square of the circuit function (squared circuits); this allows the use of negative weights while retaining tractability, and can be exponentially more expressive efficient than monotone circuits. Unfortunately, we show the reverse also holds, meaning that monotone circuits and squared circuits are incomparable in general. This raises the question of whether we can reconcile, and indeed improve upon the two modeling approaches. We answer in the positive by proposing Inception PCs, a novel type of circuit that naturally encompasses both monotone circuits and squared circuits as special cases, and employs complex parameters. Empirically, we validate that Inception PCs can outperform both monotone and squared circuits on a range of tabular and image datasets.
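As a rough illustration of why squaring keeps marginalization tractable even with negative weights, the sketch below builds a tiny shallow circuit over binary variables and compares the closed-form normalizer of its square against brute-force enumeration. The sizes `K` and `D`, the random parameters, and the helper names `circuit` and `density` are all assumptions made for this example; none of them come from the paper.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
K, D = 3, 4  # hypothetical: 3 product components over 4 binary variables

# leaf[k, i, v] is the value of component k's leaf for variable i at value v.
# Entries and weights are real and may be negative, so this is not a monotone circuit.
leaf = rng.normal(size=(K, D, 2))
w = rng.normal(size=K)

def circuit(x):
    """f(x) = sum_k w_k * prod_i leaf[k, i, x_i]."""
    x = np.asarray(x)
    vals = leaf[:, np.arange(D), x]  # shape (K, D)
    return float(w @ vals.prod(axis=1))

# Normalizer of f(x)^2 without enumerating assignments:
# Z = sum_{k,l} w_k w_l prod_i sum_v leaf[k,i,v] * leaf[l,i,v]
gram = np.einsum("kiv,liv->kli", leaf, leaf)  # per-variable inner products, shape (K, K, D)
Z = float(w @ gram.prod(axis=2) @ w)

def density(x):
    """Squared-circuit mass function p(x) = f(x)^2 / Z."""
    return circuit(x) ** 2 / Z

# Brute-force check over all 2^D assignments (only feasible for tiny D).
assignments = list(itertools.product([0, 1], repeat=D))
Z_brute = sum(circuit(x) ** 2 for x in assignments)
total_mass = sum(density(x) for x in assignments)
```

The closed form costs on the order of K² · D operations, whereas the brute-force sum grows as 2^D. The quadratic dependence on the number of components is the price paid for squaring in this shallow setting.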

Paper Structure (25 sections, 12 theorems, 25 equations, 4 figures, 4 tables, 3 algorithms)

Key Result

Theorem 1

(LoconteICLR24) There exists a class of non-negative functions $p(\bm{V})$ such that there exist structured-decomposable PCs $\mathcal{C}$ with $p(\bm{V}) = f_{\mathcal{C}}(\bm{V})^2$ of size polynomial in $|\bm{V}|$, but the smallest structured-decomposable monotone PC $\mathcal{C}'$ such that $p(\bm

Figures (4)

  • Figure 1: Latent variable interpretation for squaring PCs. The sum node in Figure \ref{fig:lvi} has two children with complex weights, each associated with a different value of the latent $Z$. A sum-of-squares (Figure \ref{fig:square_before_sum}) gives a monotone PC, where the parameters necessarily become non-negative. A square-of-sums (Figure \ref{fig:sum_before_square}) leads to a squared PC with four children, each corresponding to the product of two children from the original circuit.
  • Figure 2: Diagrams showing an Inception PC, and the corresponding materialized IncPC. Each product node is labelled with an index, such that e.g. $\times_{34}$ is the product of the product nodes $\times_3, \times_4$. For clarity, the children of product nodes have been omitted (except the latent indicators), and edge weights for the materialized Inception PC are displayed below the corresponding child.
  • Figure 3: Illustration of tensorized Inception PC sum region, where $N_1 = 3$, $N_2 = 2$.
  • Figure 4: Test bits per dimension (bpd) for a range of configurations of $(N_1, N_2)$ on ImageNet32 (lower is better); configurations are limited to $2^{18} \approx 262\text{K}$ FLOPs per region. The Inception PC with $N_1=32, N_2=4$ achieves the best performance.
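
The two cases in the Figure 1 caption can be written out for a single sum node. This is a sketch under assumed notation: $f_1, f_2$ for the two children, $w_1, w_2$ for their complex weights, and squaring taken as multiplication by the complex conjugate. The paper's own symbols may differ.

```latex
% Assumed setup: one sum node, children f_1, f_2, complex weights w_1, w_2,
% latent Z \in \{1, 2\} selecting the child.

% Sum-of-squares (square each branch, then sum out Z): weights |w_z|^2 are non-negative.
\sum_{z=1}^{2} \bigl| w_z f_z(\bm{V}) \bigr|^2
  = |w_1|^2 \, |f_1(\bm{V})|^2 + |w_2|^2 \, |f_2(\bm{V})|^2

% Square-of-sums (sum out Z, then square): four terms, one per pair (z, z').
\Bigl| \sum_{z=1}^{2} w_z f_z(\bm{V}) \Bigr|^2
  = \sum_{z=1}^{2} \sum_{z'=1}^{2} w_z \, \overline{w_{z'}} \; f_z(\bm{V}) \, \overline{f_{z'}(\bm{V})}
```

In the first expansion every coefficient is a squared modulus, which matches the caption's remark that the parameters necessarily become non-negative. In the second, the cross terms $w_1 \overline{w_2}$ and $w_2 \overline{w_1}$ can be negative or complex, and the four summands correspond to the four children of the squared PC.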

Theorems & Definitions (25)

  • Definition 1: Probabilistic Circuit
  • Definition 2: Smoothness, Decomposability
  • Definition 3: Structured Decomposability
  • Definition 4: Monotone PC
  • Theorem 1
  • Theorem 2
  • proof
  • Proposition 1: Tractability of Complex Conjugation
  • Definition 5: Inception PC
  • Theorem 3: Tractability of InceptionPC
  • ...and 15 more