Soft Learning Probabilistic Circuits

Soroush Ghandi, Benjamin Quost, Cassio de Campos

TL;DR

This work identifies a misalignment between learning-time clustering and inference-time querying in probabilistic circuits (PCs), highlighting limitations of the greedy LearnSPN approach. It introduces SoftLearn, a learning procedure based on soft clustering that propagates data point weights through all paths, aiming to produce smoother marginals and better likelihoods. Empirical results on binary, mixed, and image data show that SoftLearn often outperforms LearnSPN in test likelihood and sample quality, while remaining a robust and versatile base model for further tractable models. The findings advocate for structure learning in PCs that is compatible with how the circuit is queried at inference time, and point to future enhancements such as pruning and merging to further improve performance and efficiency.

Abstract

Probabilistic Circuits (PCs) are prominent tractable probabilistic models, allowing for a range of exact inferences. This paper focuses on the main algorithm for training PCs, LearnSPN, a gold standard due to its efficiency, performance, and ease of use, in particular for tabular data. We show that LearnSPN is a greedy likelihood maximizer under mild assumptions. While inferences in PCs may use the entire circuit structure for processing queries, LearnSPN applies a hard method for learning them, propagating each data point at each sum node through one and only one of the children/edges, as in a hard clustering process. We propose a new learning procedure, named SoftLearn, that induces a PC using a soft clustering process. We investigate the effect of this learning-inference compatibility in PCs. Our experiments show that SoftLearn outperforms LearnSPN in many situations, yielding better likelihoods and arguably better samples. We also analyze comparable tractable models to highlight the differences between soft/hard learning and model querying.
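
The difference between hard and soft propagation at a sum node can be illustrated with a small toy sketch. It is not the authors' implementation: the one-dimensional data, the fixed cut at 0, the hypothetical cluster centers, and the helper names (`hard_weights`, `soft_weights`, `child_means`) are all assumptions made for illustration. Soft clustering is stood in for by normalized Gaussian affinities, which may differ from the clustering used in SoftLearn.

```python
import math
import random


def hard_weights(x, cut=0.0):
    """Hard clustering: the point is sent down exactly one child."""
    return (1.0, 0.0) if x < cut else (0.0, 1.0)


def soft_weights(x, centers=(-1.0, 1.0), scale=1.0):
    """Soft clustering: the point is shared among all children.

    Weights are normalized Gaussian affinities to hypothetical centers
    (an illustrative choice, not taken from the paper).
    """
    affinities = [math.exp(-0.5 * ((x - c) / scale) ** 2) for c in centers]
    total = sum(affinities)
    return tuple(a / total for a in affinities)


def weighted_mean(xs, ws):
    """Mean of xs where each point counts with its (possibly fractional) weight."""
    return sum(w * x for x, w in zip(xs, ws)) / sum(ws)


def child_means(xs, weight_fn):
    """Estimate the mean of each child's Gaussian leaf from weighted data."""
    weights = [weight_fn(x) for x in xs]
    return tuple(
        weighted_mean(xs, [w[k] for w in weights]) for k in range(2)
    )


rng = random.Random(0)
# Two overlapping Gaussian clusters with true means -1 and +1.
data = [rng.gauss(-1.0, 1.0) for _ in range(5000)]
data += [rng.gauss(1.0, 1.0) for _ in range(5000)]

hard_means = child_means(data, hard_weights)
soft_means = child_means(data, soft_weights)
print("hard:", hard_means)
print("soft:", soft_means)
```

In this toy setup, the hard cut truncates both clusters, so each child sees only one side of the data and its estimated mean is pushed away from the cut. With soft weights, every point contributes to both children in proportion to its affinity, so each leaf can still account for points on the other side of the cut.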

Paper Structure

This paper contains 19 sections, 6 equations, 4 figures, 10 tables, 2 algorithms.

Figures (4)

  • Figure 1: PC structure equivalent to Expression \ref{eq:genbadclus}, with a root sum node (in blue) with balanced weights to its children, which are two product nodes (in green), and four leaf distribution nodes (in salmon).
  • Figure 2: Green points $(X,Y)$ (horizontal and vertical axes, respectively) generated from the PC in Expression \ref{eq:genbadclus}. The gray line is a hypothetical bad partition obtained for the root node. SoftLearn yields the mean parameters of the Gaussian leaves represented by the black diamonds, which still capture the whole Gaussians on both sides of the gray cut of the first step; standard LearnSPN yields the red triangles as means, as both clusters are necessarily treated separately.
  • Figure 3: Samples from PCs trained on Binary MNIST (n.9 and n.5), using (left) LearnSPN and (right) SoftLearn.
  • Figure 4: Caption