Semi-parametric Expert Bayesian Network Learning with Gaussian Processes and Horseshoe Priors

Yidou Weng, Finale Doshi-Velez

TL;DR

SEBN advances expert Bayesian networks by jointly learning linear and nonlinear edge components through Gaussian Processes while regularizing with a Horseshoe prior. By using differential Horseshoe scales, it prioritizes modifying expert edges over adding non-expert ones, addressing identifiability and interpretability. The authors implement an exact Dynamic Programming–based structure search under a partial topological order and evaluate on synthetic data and the UCI Liver Disorders dataset, showing improvements over state-of-the-art SPBN in structural accuracy (SHD) and test likelihood. The work offers diverse, interpretable graph options for real-world settings where ground truth is unknown and expert knowledge should guide learning, with potential healthcare applications and extensions to partially observed data.

Abstract

This paper proposes a model for learning Semi-parametric relationships in an Expert Bayesian Network (SEBN) with linear parameter and structure constraints. We use Gaussian Processes and a Horseshoe prior to introduce minimal nonlinear components. To prioritize modifying the expert graph over adding new edges, we optimize differential Horseshoe scales. In real-world datasets with unknown truth, we generate diverse graphs to accommodate user input, addressing identifiability issues and enhancing interpretability. Evaluation on synthetic and UCI Liver Disorders datasets, using metrics such as Structural Hamming Distance (SHD) and test likelihood, demonstrates that our models outperform the state-of-the-art semi-parametric Bayesian Network (SPBN) model.
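
Since the abstract relies on Structural Hamming Distance as its structural metric, a minimal sketch of one common SHD variant may help. It assumes graphs are given as square 0/1 adjacency matrices (entry [i][j] = 1 means an edge i -> j) and counts a missing, extra, or reversed edge as one error each. The function name and the choice to count a reversal once are assumptions for illustration; the paper may use a different convention.

```python
import numpy as np

def structural_hamming_distance(true_adj, learned_adj):
    """Count node pairs whose edge status differs between two directed graphs.

    Assumption: a reversed edge counts as one error (some SHD variants count two).
    """
    a = np.asarray(true_adj, dtype=bool)
    b = np.asarray(learned_adj, dtype=bool)
    if a.ndim != 2 or a.shape[0] != a.shape[1] or a.shape != b.shape:
        raise ValueError("expected two square adjacency matrices of equal size")
    # True wherever a directed edge i -> j is present in exactly one graph.
    diff = a != b
    # Collapse (i, j) and (j, i) so each unordered node pair is counted once.
    pair_diff = diff | diff.T
    return int(np.triu(pair_diff, k=1).sum())

# Hypothetical 3-node example: true graph 0 -> 1 -> 2.
true_graph = [[0, 1, 0],
              [0, 0, 1],
              [0, 0, 0]]
# Learned graph reverses 0 -> 1, adds 0 -> 2, and misses 1 -> 2.
learned_graph = [[0, 0, 1],
                 [1, 0, 0],
                 [0, 0, 0]]
print(structural_hamming_distance(true_graph, learned_graph))
```

A lower SHD means the learned structure is closer to the reference graph, which is why it can only be reported where a ground-truth graph exists, as in the synthetic experiments.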

Paper Structure (31 sections, 14 equations, 4 figures, 1 table, 1 algorithm)

Figures (4)

  • Figure 1: On the Independent-addition dataset: (a) and (b) At all network sizes, jointly training the linear and GP parameters got about 50% closer to the oracle solution than the two-stage approach. (c) SEBN, with or without the Horseshoe prior, significantly outperformed the state-of-the-art baseline SPBN.
  • Figure 2: On the Independent-addition dataset: (a) and (b) In SHD and test likelihood, all models with the Horseshoe prior outperformed the one without. Performance across Horseshoe scales followed a U-shaped pattern: scales that were too small penalized GP edges too severely and scales that were too large penalized them too lightly, both resulting in incorrect structure. $\tau = 5$ emerged as the optimal choice. (c) and (d) In SHD and test likelihood, $w_{HS} = 1$ consistently outperformed $w_{HS} = 10$.
  • Figure 3: On the Expert-guided dataset, the Horseshoe prior improved SHD (a) and test likelihood (b) in most cases. The model with differential Horseshoe prior scales consistently outperformed those with uniform scales, whether too small ($\tau = 0.001$) or too large ($\tau = 5$).
  • Figure 4: Learned graphs for UCI Liver Disorders: (a) Expert graph learned as a linear GBN. (b) Graph learned with No HS, incorporating 6 additional GP edges with a test likelihood of -72.31. (c) Uniform HS scale $\tau = 5$, adding 3 GP edges and achieving a test log likelihood of -72.31. (d) Differential HS scales $\tau = 5$ and $\tau = 0.001$ for expert and non-expert graphs respectively, introducing 3 GP edges with a test likelihood of -60.56. (e) Uniform HS scale $\tau = 0.001$, adding 1 GP edge with a test likelihood of -60.57. (f) SPBN, learning 4 non-parametric edges with a test likelihood of -213.06.
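
The captions above contrast uniform and differential Horseshoe scales. As a rough illustration of why the global scale $\tau$ matters, the sketch below draws from the standard Horseshoe prior (a Gaussian whose standard deviation is a half-Cauchy local scale times $\tau$) at the two scales quoted in Figure 4(d). The function name and the idea of applying the prior to a generic per-edge weight are assumptions for illustration; how the paper attaches the prior to GP components is not stated in this summary.

```python
import numpy as np

def sample_horseshoe(tau, size, rng):
    """Draw w_j ~ N(0, (lambda_j * tau)^2) with lambda_j ~ half-Cauchy(0, 1)."""
    # Heavy-tailed local scales let a few weights escape shrinkage.
    local_scales = np.abs(rng.standard_cauchy(size))
    return rng.normal(0.0, 1.0, size) * local_scales * tau

TAU_EXPERT = 5.0        # scale quoted for expert edges in Figure 4(d)
TAU_NON_EXPERT = 0.001  # scale quoted for non-expert edges in Figure 4(d)

# Same seed for both draws, so only the global scale differs.
expert_draws = sample_horseshoe(TAU_EXPERT, 10_000, np.random.default_rng(0))
non_expert_draws = sample_horseshoe(TAU_NON_EXPERT, 10_000, np.random.default_rng(0))

print("median |w|, expert scale:    ", np.median(np.abs(expert_draws)))
print("median |w|, non-expert scale:", np.median(np.abs(non_expert_draws)))
```

Under this sketch, draws at the smaller scale concentrate much more tightly around zero, which is consistent with the described intent of making nonlinear additions on non-expert edges harder to justify than modifications to expert edges.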