
Probabilistic Generating Circuits -- Demystified

Sanyam Agarwal, Markus Bläser

TL;DR

This work revisits probabilistic generating circuits (PGCs) and shows that their apparent advantage over probabilistic circuits (PCs) comes from allowing negative weights, not from representing the distribution differently. It proves that any PGC over binary variables can be converted, with only polynomial blowup, into a PC with negative weights computing the same distribution, so PGCs are equivalent to nonmonotone PCs in the binary setting. For categorical variables with larger image size, the paper proves a hardness result: PGCs over such variables do not support tractable marginalization unless NP = P. In contrast, modeling such variables with negative-weight PCs that compute set-multilinear polynomials keeps marginalization tractable. The authors further provide a suite of compositional operations that preserve distributional correctness, and they relate nonmonotone PCs to determinantal point processes (DPPs), including results that suggest a close but unresolved separation between these formalisms. Overall, the work clarifies the power and limitations of PGCs, situates them within the landscape of tractable probabilistic models, and highlights the pivotal role of negative weights and set-multilinearity.

Abstract

Zhang et al. (ICML 2021, PMLR 139, pp. 12447-1245) introduced probabilistic generating circuits (PGCs) as a probabilistic model to unify probabilistic circuits (PCs) and determinantal point processes (DPPs). At first glance, PGCs store a distribution in a very different way: they compute the probability generating polynomial instead of the probability mass function, and it seems that this is the main reason why PGCs are more powerful than PCs or DPPs. However, PGCs also allow for negative weights, whereas classical PCs assume that all weights are nonnegative. One of the main insights of our paper is that the negative weights, and not the different representation, are responsible for the power of PGCs. PGCs are PCs in disguise: in particular, we show how to transform any PGC into a PC with negative weights with only polynomial blowup. PGCs were defined by Zhang et al. only for binary random variables. As our second main result, we show that there is a good reason for this: we prove that PGCs for categorical variables with larger image size do not support tractable marginalization unless NP = P. On the other hand, we show that we can model categorical variables with larger image size as PCs with negative weights computing set-multilinear polynomials. These allow for tractable marginalization. In this sense, PCs with negative weights strictly subsume PGCs.
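
To make the two representations concrete, the following minimal sketch compares the probability generating polynomial with the polynomial form of the probability mass function for a small, made-up distribution over three binary variables (the weights k/36 are purely illustrative and are not the distribution of Figure 1). The function names and the rescaling identity p(x, xbar) = (prod_i xbar_i) * g(x_1/xbar_1, ..., x_n/xbar_n) are choices made for this illustration; the sketch works on explicit tables and does not reproduce the paper's circuit transformation.

```python
from fractions import Fraction
from itertools import product
from math import prod

N = 3
# Made-up distribution: the k-th assignment in lexicographic order has probability k/36.
PMF = {s: Fraction(k, 36) for k, s in enumerate(product((0, 1), repeat=N), start=1)}

def generating_polynomial(z):
    """g(z) = sum_s Pr(s) * prod_{i : s_i = 1} z_i."""
    return sum(p * prod(z[i] for i in range(N) if s[i]) for s, p in PMF.items())

def pmf_polynomial(x, xbar):
    """p(x, xbar) = sum_s Pr(s) * prod_i (x_i if s_i = 1 else xbar_i)."""
    return sum(
        p * prod(x[i] if s[i] else xbar[i] for i in range(N))
        for s, p in PMF.items()
    )

# Rescaling identity, checked at an arbitrary point with nonzero xbar entries.
X, XBAR = (2, 3, 5), (7, 11, 13)
LHS = pmf_polynomial(X, XBAR)
RHS = prod(XBAR) * generating_polynomial([Fraction(a, b) for a, b in zip(X, XBAR)])

# Marginal Pr(X_1 = 1) from each representation.
# Generating polynomial: total mass minus the mass of assignments with X_1 = 0.
MARGINAL_VIA_G = generating_polynomial((1, 1, 1)) - generating_polynomial((0, 1, 1))
# PMF polynomial: set xbar_1 = 0 and every other indicator to 1.
MARGINAL_VIA_P = pmf_polynomial((1, 1, 1), (0, 1, 1))

print(LHS == RHS, MARGINAL_VIA_G, MARGINAL_VIA_P)
```

Both evaluations return the same marginal, which illustrates that the two polynomials carry the same information for binary variables; exact rational arithmetic is used so the comparison does not depend on floating-point rounding.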

Paper Structure (11 sections, 12 theorems, 19 equations, 10 figures)

Key Result

Theorem 6.1

Counting perfect matchings in bipartite graphs is $\mathsf{\#P}$-complete under Turing reductions.
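
As a concrete illustration of the counting problem in this theorem, the sketch below counts perfect matchings of a bipartite graph by brute force, using the standard fact that this count equals the permanent of the 0/1 biadjacency matrix. The example graph is a hypothetical 3-regular bipartite graph on 4 + 4 vertices in which u_i is adjacent to v_i, v_{i+1} and v_{i+2} (indices modulo 4); it is an assumption made here and not necessarily the graph of Figure 4.

```python
from itertools import permutations

def count_perfect_matchings(biadj):
    """Permanent of a square 0/1 biadjacency matrix, by enumerating all permutations."""
    n = len(biadj)
    return sum(
        all(biadj[i][sigma[i]] for i in range(n))  # True iff every edge (u_i, v_sigma(i)) exists
        for sigma in permutations(range(n))
    )

N = 4
# Hypothetical 3-regular bipartite graph: u_i is adjacent to v_i, v_{i+1}, v_{i+2} (mod 4).
BIADJ = [[1 if (j - i) % N in (0, 1, 2) else 0 for j in range(N)] for i in range(N)]

NUM_MATCHINGS = count_perfect_matchings(BIADJ)
print(NUM_MATCHINGS)  # 9
```

The enumeration visits all n! permutations, so it is only usable for very small graphs; it is meant to pin down what is being counted, not to suggest an efficient algorithm.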

Figures (10)

  • Figure 1: A distribution over binary random variables
  • Figure 2: A PC over binary random variables, computing the distribution in Figure 1
  • Figure 3: Example of a PGC computing the probability generating function of the distribution in Figure 1
  • Figure 4: A $3$-regular bipartite graph with bipartition $U = \{u_1,u_2,u_3,u_4\}$ and $V = \{v_1,v_2,v_3,v_4\}$.
  • Figure 5: The thick edges form a perfect matching. Any subset of it forms a matching.
  • ...and 5 more figures

Theorems & Definitions (37)

  • Example 2.1
  • Example 2.2
  • Example 5.1
  • Example 5.2
  • Theorem 6.1: Valiant
  • Theorem 7.1
  • proof
  • Definition 7.2: Selective Marginalization
  • Theorem 7.3
  • Definition 7.4: $(2,3)$-regular bipartite graph
  • ...and 27 more