Optimal Transport for Probabilistic Circuits

Adrian Ciotinga, YooJung Choi

TL;DR

This work tackles the challenge of computing transport-based distances between probabilistic circuits (PCs) by introducing the circuit Wasserstein distance $\textsf{CW}_p$, which restricts couplings to a coupling circuit and thereby enables exact, quadratic-time computation for compatible PCs. It develops an efficient recursive algorithm to compute $\textsf{CW}_p$ and extract the associated transport plan, and shows that $\textsf{CW}_p$ upper-bounds the true Wasserstein distance while remaining tractable where classical OT solvers fail on PCs. To enable learning, the paper proposes the Empirical Circuit Wasserstein (ECW) distance and an iterative Wasserstein-Minimization (WM) method that alternates between optimizing the coupling and updating PC parameters, offering a practical alternative to maximum-likelihood training. Empirical results on synthetic PCs, MNIST-classified circuits, and a color-transfer task demonstrate the scalability of $\textsf{CW}_p$, its utility as a proxy for true OT, and the viability of Wasserstein-based PC parameter learning, with stochastic variants improving optimization in larger models.

Abstract

We introduce a novel optimal transport framework for probabilistic circuits (PCs). While it has been shown recently that divergences between distributions represented as certain classes of PCs can be computed tractably, to the best of our knowledge, there is no existing approach to compute the Wasserstein distance between probability distributions given by PCs. We propose a Wasserstein-type distance that restricts the coupling measure of the associated optimal transport problem to be a probabilistic circuit. We then develop an algorithm for computing this distance by solving a series of small linear programs and derive the circuit conditions under which this is tractable. Furthermore, we show that we can easily retrieve the optimal transport plan between the PCs from the solutions to these linear programs. Lastly, we study the empirical Wasserstein distance between a PC and a dataset, and show that we can estimate the PC parameters to minimize this distance through an efficient iterative algorithm.
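The "series of small linear programs" can be illustrated with a toy recursion. The sketch below is not the authors' implementation: the classes `Leaf`, `Product`, `Sum` and the functions `solve_2x2` and `cw` are hypothetical names, leaves are assumed to be Bernoulli distributions over Boolean variables, the two circuits are assumed to share the same structure with children listed in matching scope order, and every sum node is assumed to have exactly two children so that its linear program has a one-parameter feasible set and can be solved by checking two endpoints.

```python
from dataclasses import dataclass


@dataclass
class Leaf:
    p: float  # assumed Bernoulli parameter, P(X = 1)


@dataclass
class Product:
    children: list  # children over disjoint scopes


@dataclass
class Sum:
    weights: list  # two mixture weights summing to 1
    children: list  # two children over the same scope


def solve_2x2(a, b, cost):
    """Minimise sum_ij theta[i][j] * cost[i][j] subject to
    row sums a, column sums b, theta >= 0 (2x2 case only).

    The feasible plans form a segment indexed by t = theta[0][0],
    and a linear objective is minimised at one of its endpoints.
    """
    lo = max(0.0, b[0] - a[1])
    hi = min(a[0], b[0])
    best = None
    for t in (lo, hi):
        theta = [[t, a[0] - t], [b[0] - t, a[1] - b[0] + t]]
        value = sum(theta[i][j] * cost[i][j] for i in range(2) for j in range(2))
        if best is None or value < best[0]:
            best = (value, theta)
    return best


def cw(p, q):
    """Transport cost under a circuit-shaped coupling (toy encoding)."""
    if isinstance(p, Leaf) and isinstance(q, Leaf):
        # Optimal transport cost between two Bernoullis on {0, 1}.
        return abs(p.p - q.p)
    if isinstance(p, Product) and isinstance(q, Product):
        # Assumes the ground cost adds up across disjoint scopes.
        return sum(cw(pc, qc) for pc, qc in zip(p.children, q.children))
    if isinstance(p, Sum) and isinstance(q, Sum):
        # Couple every pair of children, then solve the small LP.
        cost = [[cw(pc, qc) for qc in q.children] for pc in p.children]
        value, _theta = solve_2x2(p.weights, q.weights, cost)
        return value
    raise TypeError("circuits are not compatible in this toy encoding")


P = Sum([0.5, 0.5], [Product([Leaf(0.9), Leaf(0.9)]),
                     Product([Leaf(0.1), Leaf(0.1)])])
Q = Sum([0.3, 0.7], [Product([Leaf(1.0), Leaf(1.0)]),
                     Product([Leaf(0.0), Leaf(0.0)])])
distance = cw(P, Q)  # roughly 0.52 for these toy circuits
```

The plan `_theta` returned at each sum node is what a transport plan would be read from. Sum nodes with more than two children would need a general LP solver (for example `scipy.optimize.linprog`), and a circuit with shared sub-circuits would need memoisation over node pairs to avoid repeated work.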

Paper Structure

This paper contains 42 sections, 6 theorems, 17 equations, 11 figures, 1 table, 3 algorithms.

Key Result

Theorem 1

Suppose $P$ and $Q$ are probabilistic circuits over $n$ Boolean variables. Then computing the $\infty$-Wasserstein distance between $P$ and $Q$ is coNP-hard.
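For reference, the standard definitions of the Wasserstein distances involved are sketched below. The notation is an assumption made here and may differ from the paper's Definition 1: $\Gamma(P, Q)$ denotes the set of couplings with marginals $P$ and $Q$, and $d$ is the ground metric.

```latex
W_p(P, Q) = \left( \inf_{\gamma \in \Gamma(P, Q)}
    \mathbb{E}_{(x, y) \sim \gamma} \left[ d(x, y)^p \right] \right)^{1/p},
\qquad
W_\infty(P, Q) = \inf_{\gamma \in \Gamma(P, Q)}
    \operatorname*{ess\,sup}_{(x, y) \sim \gamma} d(x, y).
```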

Figures (11)

  • Figure 1: Compatible circuits over $\mathbf{X}\!=\!\{X_1,X_2,X_3\}$ and $\mathbf{Y}\!=\!\{Y_1,Y_2,Y_3\}$. Nodes in the same color have the same scope, and the scope decomposition is visualized on the right.
  • Figure 2: Recursive construction of coupling circuits. (Top) Product nodes couple children with corresponding scopes. (Bottom) Sum nodes couple the Cartesian product of children, with marginal constraints for the parameters.
  • Figure 3: Runtime of Wasserstein-type distance computation using our approach (blue dots) and the baselines ($\textsf{MW}_1$ red triangles, $\textsf{W}_1$ green squares, and Sinkhorn distance orange squares). Left: Fixed $k=4$, variable $v$. Right: Fixed $v=2$, variable $k$. Each data point is averaged over 20 runs. See Appendix \ref{sec:moreruntime} for more detailed experimental results.
  • Figure 4: Proportion of instances for which $\textsf{CW}_1$, $\textsf{MW}_1$, and $\textsf{W}_1$ could be computed without numerical instability or out-of-memory (OOM) issues. Note that only $\textsf{CW}_1$ could be computed exactly for every circuit pair.
  • Figure 5: Distributions of Kendall correlation coefficients between Cosine Similarity and $-\textsf{CW}_1$ (far left), likelihood (center left), negative Sinkhorn distance computed with $n=1000$ samples (center right), and negative Sinkhorn distance computed with $n=5000$ samples (far right). Higher is better.
  • ...and 6 more figures
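
The construction described in the Figure 2 caption suggests the following recursion, given here as a sketch rather than the paper's exact statement. The symbols $\alpha_i$, $\beta_j$, $\theta_{ij}$, $P_i$, and $Q_j$ are notation assumed for illustration, and the product-node rule additionally assumes a ground cost that adds up across disjoint scopes, such as $\|x - y\|_p^p$.

```latex
% Sum nodes: P = \sum_i \alpha_i P_i and Q = \sum_j \beta_j Q_j.
% The coupling weights \theta_{ij} solve a small linear program
% whose constraints are the marginal constraints on the parameters.
\textsf{CW}_p^p(P, Q) = \min_{\theta \ge 0} \sum_{i, j} \theta_{ij}\, \textsf{CW}_p^p(P_i, Q_j)
\quad \text{s.t.} \quad
\sum_j \theta_{ij} = \alpha_i \;\; \forall i, \qquad
\sum_i \theta_{ij} = \beta_j \;\; \forall j.

% Product nodes: children with corresponding scopes are coupled pairwise.
\textsf{CW}_p^p(P_1 \times P_2,\; Q_1 \times Q_2)
= \textsf{CW}_p^p(P_1, Q_1) + \textsf{CW}_p^p(P_2, Q_2).
```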

Theorems & Definitions (17)

  • Definition 1
  • Theorem 1
  • Definition 2: Circuit compatibility [vergari2021atlas]
  • Definition 3: Coupling circuit
  • Definition 4: Circuit Wasserstein distance
  • Proposition 1
  • Theorem 2
  • Definition 5: Empirical Circuit Wasserstein distance
  • proof
  • Proposition 2
  • ...and 7 more