Polynomial Semantics of Tractable Probabilistic Circuits
Oliver Broadrick, Honghua Zhang, Guy Van den Broeck
TL;DR
The paper addresses how different polynomial encodings of probability distributions in probabilistic circuits relate to one another for distributions over binary variables $X_1,\dots,X_n$. It shows that circuits for the network polynomial $p(x,\bar{x})$, the likelihood polynomial $p(x)$, the generating function $g(x)$, and the Fourier transform $\hat{p}(x)$ can be transformed into one another in polynomial time, so all four preserve tractable marginal inference. It leverages Strassen-style division elimination to convert circuits containing division into division-free circuits, which enables efficient marginals across semantics and gives a unified view of the four representations. The work also extends probabilistic generating circuits (PGCs) to categorical distributions and shows that inference on PGCs is #P-hard for $k \ge 4$ categories, signaling fundamental limits and guiding future cross-representation methods. Overall, the results unify previously separate marginal-inference approaches and allow a circuit learned in one representation to be transformed into the others with only polynomial overhead.
Abstract
Probabilistic circuits compute multilinear polynomials that represent multivariate probability distributions. They are tractable models that support efficient marginal inference. However, various polynomial semantics have been considered in the literature (e.g., network polynomials, likelihood polynomials, generating functions, and Fourier transforms). The relationships between circuit representations of these polynomial encodings of distributions are largely unknown. In this paper, we prove that for distributions over binary variables, each of these probabilistic circuit models is equivalent in the sense that any circuit for one of them can be transformed into a circuit for any of the others with only a polynomial increase in size. They are therefore all tractable for marginal inference on the same class of distributions. Finally, we explore the natural extension of one such polynomial semantics, called probabilistic generating circuits, to categorical random variables, and establish that inference becomes #P-hard.
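To make the polynomial semantics concrete, the following is a minimal sketch that evaluates three of the polynomials by brute force on a toy distribution over two binary variables. It assumes the standard definitions (network polynomial with one indicator per literal, likelihood polynomial obtained by setting $\bar{x}_i = 1 - x_i$, generating function obtained by setting $\bar{x}_i = 1$). The distribution `probs` and all function names are hypothetical, the polynomials are expanded explicitly rather than represented as circuits, and the Fourier representation is left out because its normalization convention is not stated here.

```python
from math import prod

# Hypothetical toy distribution over two binary variables (X1, X2).
probs = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}

def network_poly(x, xbar):
    """p(x, xbar) = sum_v Pr(v) * prod_i (x_i if v_i == 1 else xbar_i)."""
    total = 0.0
    for v, pr in probs.items():
        term = pr
        for vi, xi, xbi in zip(v, x, xbar):
            term *= xi if vi == 1 else xbi
        total += term
    return total

def likelihood_poly(x):
    """p(x): the network polynomial with xbar_i replaced by 1 - x_i."""
    return network_poly(x, [1 - xi for xi in x])

def generating_poly(x):
    """g(x): the network polynomial with every xbar_i set to 1."""
    return network_poly(x, [1] * len(x))

# Marginal Pr(X1 = 1): select X1 = 1 and set both indicators of X2 to 1.
marginal_x1 = network_poly([1, 1], [0, 1])
print("Pr(X1 = 1) is approximately", marginal_x1)

# The likelihood polynomial agrees with the distribution on 0/1 inputs.
assert abs(likelihood_poly([1, 0]) - probs[(1, 0)]) < 1e-12

# The network polynomial is recovered from g using a division:
# p(x, xbar) = (prod_i xbar_i) * g(x_1 / xbar_1, ..., x_n / xbar_n).
x, xbar = [0.5, 2.0], [0.25, 4.0]
lhs = network_poly(x, xbar)
rhs = prod(xbar) * generating_poly([xi / xbi for xi, xbi in zip(x, xbar)])
assert abs(lhs - rhs) < 1e-9
```

The last identity hints at why division appears when moving between semantics: going from $g(x)$ back to $p(x,\bar{x})$ naively divides by the $\bar{x}_i$, and a division-elimination step is what keeps the resulting circuit division-free.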
