Sum of Squares Circuits
Lorenzo Loconte, Stefan Mengel, Antonio Vergari
TL;DR
This work studies the expressiveness of probabilistic circuits (PCs) that support tractable inference and introduces sum of compatible squares (SOCS) circuits as a non-monotonic extension of PCs. It establishes an expressiveness hierarchy showing that monotonic PCs can be more expressive than squared PCs, while SOCS circuits can be exponentially more expressive than both. Models such as PSD models, Born Machines, and Inception PCs are unified under the common SOCS framework. The authors provide two constructive separations (UPS and UTQ), show that complex parameters further increase expressiveness within SOCS, and validate the approach empirically on distribution estimation for tabular and image data. The results suggest SOCS as a scalable, expressive, and practical approach to tractable probabilistic modeling, with potential connections to SOS polynomials and broader non-negative representations.
Abstract
Designing expressive generative models that support exact and efficient inference is a core question in probabilistic ML. Probabilistic circuits (PCs) offer a framework where this tractability-vs-expressiveness trade-off can be analyzed theoretically. Recently, squared PCs encoding subtractive mixtures via negative parameters have emerged as tractable models that can be exponentially more expressive than monotonic PCs, i.e., PCs with positive parameters only. In this paper, we provide a more precise theoretical characterization of the expressiveness relationships among these models. First, we prove that squared PCs can be less expressive than monotonic ones. Second, we formalize a novel class of PCs -- sum of squares PCs -- that can be exponentially more expressive than both squared and monotonic PCs. Around sum of squares PCs, we build an expressiveness hierarchy that allows us to precisely unify and separate different tractable model classes such as Born Machines and PSD models, and other recently introduced tractable probabilistic models by using complex parameters. Finally, we empirically show the effectiveness of sum of squares circuits in performing distribution estimation.
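As a rough illustration of the three model classes named in the abstract, the sketch below builds a monotonic mixture, a squared subtractive mixture, and a sum of squares over a single discrete variable. It is a toy under stated assumptions, not the paper's construction: the `components` table, the weights, and all function names are hypothetical, and real circuits are deep and structured rather than a single weighted sum.

```python
import numpy as np

# Hypothetical toy setup: one discrete variable with 4 states and two
# non-negative component functions (one per row), chosen for illustration.
components = np.array([
    [0.4, 0.3, 0.2, 0.1],
    [0.1, 0.2, 0.3, 0.4],
])

def monotonic_mixture(weights):
    """Mixture with non-negative weights only (a monotonic PC in miniature)."""
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0):
        raise ValueError("monotonic mixtures require non-negative weights")
    unnorm = weights @ components
    return unnorm / unnorm.sum()

def squared_mixture(weights):
    """Square of a real-weighted sum; negative weights subtract components."""
    weights = np.asarray(weights, dtype=float)
    unnorm = (weights @ components) ** 2  # non-negative by construction
    return unnorm / unnorm.sum()          # normalise by summing over all states

def sum_of_squares(weight_matrix):
    """Sum of several squared real-weighted sums over the same components."""
    weight_matrix = np.asarray(weight_matrix, dtype=float)
    unnorm = ((weight_matrix @ components) ** 2).sum(axis=0)
    return unnorm / unnorm.sum()

p_mono = monotonic_mixture([0.5, 0.5])             # uniform over the 4 states
p_sq = squared_mixture([1.0, -1.0])                # mass pushed to the two ends
p_sos = sum_of_squares([[1.0, -1.0], [0.5, 0.5]])  # one square per weight row
```

With these toy numbers the equal-weight monotonic mixture is uniform, while the squared mixture with weights (1, -1) is bimodal because the subtraction cancels mass in the middle states. The sum of squares adds a second squared term, so it need not have the near-zero regions that a single square can produce.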
