Probabilistic Circuits with Constraints via Convex Optimization

Soroush Ghandi, Benjamin Quost, Cassio de Campos

TL;DR

Empirical evaluations indicate that the combination of constraints and PCs can have multiple use cases, including the improvement of model performance under scarce or incomplete data, as well as the enforcement of machine learning fairness measures into the model without compromising model fitness.

Abstract

This work addresses integrating probabilistic propositional logic constraints into the distribution encoded by a probabilistic circuit (PC). PCs are a class of tractable models that allow efficient computations (such as conditional and marginal probabilities) while achieving state-of-the-art performance in some domains. The proposed approach takes both a PC and constraints as inputs, and outputs a new PC that satisfies the constraints. This is done efficiently via convex optimization without the need to retrain the entire model. Empirical evaluations indicate that the combination of constraints and PCs can have multiple use cases, including the improvement of model performance under scarce or incomplete data, as well as the enforcement of machine learning fairness measures into the model without compromising model fitness. We believe that these ideas will open possibilities for multiple other applications involving the combination of logics and deep probabilistic models.
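As a rough illustration of what "adjusting a distribution to satisfy a probabilistic constraint via convex optimization" can look like, the sketch below works on a single small categorical distribution rather than a full PC. It is not the paper's algorithm: the function names, the toy distribution, and the constraint $q(X_1 = 1) = 0.7$ are all assumptions made for the example. It minimizes the cross-entropy $H(p, q) = -\sum_x p(x) \log q(x)$ subject to one equality constraint on the probability of an event. That problem is convex in $q$ and, for a single event, has a closed-form solution: rescale the mass inside and outside the event.

```python
import math


def project_onto_event_constraint(p, event, target):
    """Return q minimizing H(p, q) = -sum_x p[x] * log(q[x])
    subject to sum_{x in event} q[x] == target and sum_x q[x] == 1.

    For a single event the optimum keeps the shape of p inside and
    outside the event and only rescales the two blocks.
    Hypothetical helper for illustration; p maps states to probabilities.
    """
    if not 0.0 < target < 1.0:
        raise ValueError("target must lie strictly between 0 and 1")
    mass_in = sum(prob for state, prob in p.items() if state in event)
    mass_out = sum(prob for state, prob in p.items() if state not in event)
    if mass_in <= 0.0 or mass_out <= 0.0:
        raise ValueError("p must put positive mass inside and outside the event")
    scale_in = target / mass_in
    scale_out = (1.0 - target) / mass_out
    return {
        state: prob * (scale_in if state in event else scale_out)
        for state, prob in p.items()
    }


def cross_entropy(p, q):
    """Cross-entropy H(p, q), skipping states where p has zero mass."""
    return -sum(p[x] * math.log(q[x]) for x in p if p[x] > 0.0)


# Toy joint distribution over two binary variables (X1, X2); values are made up.
p = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.3, (1, 1): 0.2}

# Assumed constraint: the adjusted distribution must satisfy q(X1 = 1) = 0.7.
event = {(1, 0), (1, 1)}
q = project_onto_event_constraint(p, event, target=0.7)

for state in sorted(q):
    print(state, round(q[state], 4))
```

In this toy case the original distribution assigns probability 0.5 to the event, so states inside it are scaled by 0.7/0.5 and states outside it by 0.3/0.5. With several overlapping constraints no such closed form is generally available, and a numerical convex solver would be needed instead.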

Paper Structure (16 sections, 2 theorems, 14 equations, 6 figures, 2 tables)

Key Result

Theorem

Assume we are given a PC representing a distribution $p(\mathbf{X})$ as in Equation eq:mixture_ppl, together with probabilistic propositional logic (PPL) constraints as in Equation eq:cform, placed in disjoint buckets $B$. Assume that $q(\mathbf{X})$ is a distribution induced by a PC of the form in Equation eq:mixture_q. Then, $H(p(\mathbf{X}),q(

Figures (6)

  • Figure 1: Example of a PC with variables $X_1,\ldots,X_3$. Sum nodes are in blue, product nodes in green, and distribution leaf nodes in salmon. In this example, all leaf nodes are univariate. The subscripts on each $p$ in the figure indicate that these are different leaf distributions (even if some are over the same variable).
  • Figure 2: Leaf distribution replacement structures that can be used to represent the parameters of a categorical variable for a bucket $B$ with $\mathbf{X}_{B} = \{X_{1}, X_{2}\}$.
  • Figure 3: LearnSPN vs. (constrained) PPL-LSPN trained on scarce datasets.
  • Figure 4: RAT-SPN vs. (constrained) PPL-RSPN trained on scarce datasets.
  • Figure 5: Sum of quadratic differences on marginal parameters between the models with and without marginal constraints, when trained on scarce data. Constraints clearly refine the model more strongly for RAT-SPNs than for LearnSPN. Standard RAT-SPN marginals are very far from matching the empirical marginal distributions (data not shown).
  • ...and 1 more figure

Theorems & Definitions (4)

  • Theorem
  • Proof
  • Theorem
  • Proof