On Faster Marginalization with Squared Circuits via Orthonormalization
Lorenzo Loconte, Antonio Vergari
TL;DR
The paper addresses the high cost of marginalization in squared circuits by introducing an orthonormal parameterization, inspired by tensor-network canonical forms, that guarantees the squared circuit is already normalized (i.e., its partition function is $Z=1$). It gives sufficient conditions based on semi-unitary matrices and orthonormal input functions, together with a QR-based orthonormalization procedure, and uses them to build a Marginalize algorithm that computes marginals with improved complexity $O(|\phi_{\mathbf{Y}}| S + |\phi_{\mathbf{Y},\mathbf{Z}}| S^2)$. The key contributions are (i) a principled parameterization that preserves expressiveness for many circuit classes, (ii) a faster, structure-aware marginalization method, and (iii) a procedure that converts non-orthonormal circuits into orthonormal ones without loss of expressive power. These advances could broaden the applicability of squared circuits to tasks that require fast marginalization, such as lossless compression and probabilistic reasoning in deep learning systems, by enabling efficient exact inference with normalized distributions.
Abstract
Squared tensor networks (TNs) and their generalization as parameterized computational graphs -- squared circuits -- have recently been used as expressive distribution estimators in high dimensions. However, the squaring operation introduces additional complexity when marginalizing variables or computing the partition function, which hinders their usage in machine learning applications. Canonical forms of popular TNs are parameterized via unitary matrices so as to simplify the computation of particular marginals, but they cannot be mapped to general circuits, since these might not correspond to a known TN. Inspired by TN canonical forms, we show how to parameterize squared circuits to ensure they encode already normalized distributions. We then use this parameterization to devise an algorithm to compute any marginal of squared circuits that is more efficient than a previously known one. We conclude by formally showing the proposed parameterization comes with no expressiveness loss for many circuit classes.
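
To make the idea of an "already normalized" squared circuit concrete, the following is a minimal NumPy sketch on a toy two-variable model. It is not the paper's algorithm. The names `K`, `S`, `F1`, `F2`, `M`, and `orthonormal_columns` are all hypothetical, and the model is assumed to have the form $c(x_1, x_2) = \mathbf{f}_1(x_1)^\top M \, \mathbf{f}_2(x_2)$ with discrete variables. The input functions are made orthonormal with a reduced QR decomposition and the weights are given unit norm, so that $\sum_{x_1,x_2} c(x_1,x_2)^2 = 1$ and the marginal of $x_1$ can be computed without touching the inputs of $x_2$.

```python
import numpy as np

def orthonormal_columns(rows, cols, rng):
    """Return a (rows x cols) matrix with orthonormal columns via reduced QR."""
    q, _ = np.linalg.qr(rng.standard_normal((rows, cols)))
    return q

rng = np.random.default_rng(0)
K, S = 5, 3  # hypothetical sizes: K states per variable, S units per layer

# Orthonormal input functions: sum_x F[x, i] * F[x, j] = delta_ij.
F1 = orthonormal_columns(K, S, rng)
F2 = orthonormal_columns(K, S, rng)

# Weights with unit Frobenius norm (a 1 x S^2 semi-unitary row, reshaped).
M = orthonormal_columns(S * S, 1, rng).reshape(S, S)

def c(x1, x2):
    """Toy circuit output before squaring."""
    return F1[x1] @ M @ F2[x2]

# Partition function of the squared circuit, by brute-force enumeration.
Z = sum(c(x1, x2) ** 2 for x1 in range(K) for x2 in range(K))

# Marginal of x1 by brute force: sum over all states of x2.
marginal_brute = np.array(
    [sum(c(x1, x2) ** 2 for x2 in range(K)) for x1 in range(K)]
)

# Marginal of x1 using orthonormality: F2^T F2 = I, so the inputs of x2
# drop out and only the squared row norms of F1 @ M remain.
marginal_fast = np.sum((F1 @ M) ** 2, axis=1)
```

In this sketch the brute-force and orthonormality-based marginals agree up to floating-point error, and `Z` equals 1 without any explicit renormalization step. The saving illustrated here, skipping the part of the model that depends only on marginalized variables, is the intuition behind the faster marginalization described above, under the stated toy assumptions.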
