Tractable Probabilistic Graph Representation Learning with Graph-Induced Sum-Product Networks

Federico Errica, Mathias Niepert

TL;DR

This work introduces Graph-Induced Sum-Product Networks (GSPNs), a new probabilistic framework for graph representation learning that can tractably answer probabilistic queries and shows the model's competitiveness on scarce supervision scenarios, under missing data, and for graph classification in comparison to popular neural models.

Abstract

We introduce Graph-Induced Sum-Product Networks (GSPNs), a new probabilistic framework for graph representation learning that can tractably answer probabilistic queries. Inspired by the computational trees induced by vertices in the context of message-passing neural networks, we build hierarchies of sum-product networks (SPNs) where the parameters of a parent SPN are learnable transformations of the a posteriori mixing probabilities of its children's sum units. Due to weight sharing and the tree-shaped computation graphs of GSPNs, we obtain the efficiency and efficacy of deep graph networks with the additional advantages of a probabilistic model. We show the model's competitiveness on scarce supervision scenarios, under missing data, and for graph classification in comparison to popular neural models. We complement the experiments with qualitative analyses on hyper-parameters and the model's ability to answer probabilistic queries.
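
The core mechanism in the abstract (a parent SPN whose prior is a learnable transformation of its children's posterior mixing probabilities) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that each SPN is a Gaussian Naïve Bayes mixture, that the "learnable transformation" is a row-stochastic matrix applied to the average of the children's posteriors, and that missing features are marked with `None`; the names `posterior_mixing`, `parent_prior`, and `transition` are hypothetical.

```python
import math


def gaussian_logpdf(x, mean, std):
    """Log-density of a univariate Gaussian leaf."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def posterior_mixing(x, log_prior, means, stds):
    """Posterior over the mixture components of a Gaussian Naive Bayes SPN.

    x is a list of feature values; None marks a missing feature, which is
    marginalized out exactly by letting its leaf contribute log(1) = 0.
    Returns (posterior probabilities, log-evidence of the observed features).
    """
    log_joint = []
    for c, lp in enumerate(log_prior):
        total = lp
        for a, value in enumerate(x):
            if value is not None:
                total += gaussian_logpdf(value, means[c][a], stds[c][a])
        log_joint.append(total)
    # Log-sum-exp for a numerically stable normalizer.
    m = max(log_joint)
    log_evidence = m + math.log(sum(math.exp(v - m) for v in log_joint))
    return [math.exp(v - log_evidence) for v in log_joint], log_evidence


def parent_prior(child_posteriors, transition):
    """Assumed transformation: average the children's posteriors, then map
    them through a row-stochastic matrix (rows: child states, columns:
    parent states) to obtain the parent's prior over its own components."""
    n_child, n_parent = len(transition), len(transition[0])
    avg = [sum(p[i] for p in child_posteriors) / len(child_posteriors)
           for i in range(n_child)]
    return [sum(avg[i] * transition[i][j] for i in range(n_child))
            for j in range(n_parent)]


# Two leaf-level vertices with two features each; the second vertex has a
# missing feature. All parameter values are arbitrary illustrations.
log_prior = [math.log(0.3), math.log(0.7)]
means = [[0.0, 1.0], [2.0, -1.0]]
stds = [[1.0, 1.0], [1.0, 1.0]]
post_u, _ = posterior_mixing([0.1, 0.9], log_prior, means, stds)
post_w, _ = posterior_mixing([1.8, None], log_prior, means, stds)
transition = [[0.9, 0.1], [0.2, 0.8]]
prior_v = parent_prior([post_u, post_w], transition)
```

Because every step is a sum or product of normalized quantities, the parent's prior remains a valid distribution, and missing features cost nothing extra to marginalize.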

Paper Structure (32 sections, 10 equations, 6 figures, 8 tables, 1 algorithm)

Figures (6)

  • Figure 1: Inference on the graphical model on the left is typically unfeasible due to the mutual dependencies induced by cycles. Therefore, we approximate the learning problem using probabilistic computational trees of height $L$ modeled by a hierarchy of tractable SPNs (right). Note that trees of height $L-1$ are used in the construction process of trees of height $L$. Also, for each tree rooted at $v$ we visualize the mapping $m_v(\cdot)$ using colors and indices corresponding to the original graph (left).
  • Figure 2: (Left) We expand the example of Figure 1 to illustrate how the prior distribution of the top SPN (here the Naïve Bayes on the right) is parametrized by a learnable transformation of the children SPNs' posterior mixture probabilities. (Right) A Gaussian Naïve Bayes graphical model with r.v. $\bm{X}=(A_1,A_2)$ and its equivalent SPN with scope $\{A_1,A_2\}$.
  • Figure 3: Mean and std results on scarce supervision tasks averaged over 10 runs, with $0.1\%$. The best and second-best average results are bold and underlined, respectively.
  • Figure 4: Relative change in vertices pseudo log-likelihood when replacing Cl in the SMILES with an O.
  • Figure 5: Impact of GSPN$_S$ layers, $C$ and $C_G$ on NCI1 (top left), COLLAB (top right), REDDIT-BINARY (bottom left), and REDDIT-5K (bottom right) performances, averaged across all configurations in the 10 outer folds.
  • ...and 1 more figures