A Note on Bayesian Networks with Latent Root Variables

Marco Zaffalon, Alessandro Antonucci

TL;DR

This note addresses learning in Bayesian networks (BNs) with latent root variables, where incomplete observations make the likelihood non-concave and expose learning to local maxima. It shows that marginalising the latent roots yields an empirical BN over the manifest variables with $P(\bm{Z})=\prod_{Z\in\bm{Z}} P(z|\bm{w}_Z)$, and constructs an auxiliary-root transformation to connect the original and empirical models. The main result proves that the log-likelihood of the original BN is bounded above by the empirical maximum $\lambda^*$, with equality precisely when the data are compatible with the original BN; this provides a principled compatibility test for EM-based learning. The findings offer a practical criterion to certify global optimality in the presence of latent roots and indicate directions for extending the framework to continuous variables.

Abstract

We characterise the likelihood function computed from a Bayesian network with latent variables as root nodes. We show that the marginal distribution over the remaining, manifest, variables also factorises as a Bayesian network, which we call empirical. A dataset of observations of the manifest variables allows us to quantify the parameters of the empirical Bayesian net. We prove that (i) the likelihood of such a dataset from the original Bayesian network is dominated by the global maximum of the likelihood from the empirical one; and that (ii) such a maximum is attained if and only if the parameters of the Bayesian network are consistent with those of the empirical model.
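
Claims (i) and (ii) can be checked numerically on a toy case. The sketch below is not taken from the paper: it assumes a hypothetical BN with one latent root U and two Boolean manifest children X and Y, a made-up table of counts, and an empirical model over (X, Y) that is an unrestricted joint PMF, so that its maximum likelihood is attained at the relative frequencies. Under these assumptions, every parametrisation of the original BN should score no higher than the empirical maximum, and a parametrisation whose manifest marginal matches the relative frequencies should attain it.

```python
import math
import random

# Hypothetical counts n(x, y) for two Boolean manifest variables (illustrative only).
counts = {(0, 0): 30, (0, 1): 10, (1, 0): 15, (1, 1): 45}
n_total = sum(counts.values())


def empirical_max_loglik(counts):
    """Maximum log-likelihood of an unrestricted joint PMF: relative frequencies."""
    n = sum(counts.values())
    return sum(c * math.log(c / n) for c in counts.values() if c > 0)


def original_loglik(counts, p_u, p_x_given_u, p_y_given_u):
    """Log-likelihood under U -> X, U -> Y with the latent root U summed out."""
    total = 0.0
    for (x, y), c in counts.items():
        p_xy = sum(
            p_u[u] * p_x_given_u[u][x] * p_y_given_u[u][y]
            for u in range(len(p_u))
        )
        total += c * math.log(p_xy)
    return total


def random_pmf(rng, k):
    """Draw a strictly positive PMF over k states."""
    weights = [rng.random() + 1e-9 for _ in range(k)]
    s = sum(weights)
    return [w / s for w in weights]


lam_star = empirical_max_loglik(counts)

# (i) Random parametrisations of the original BN never exceed the empirical maximum.
rng = random.Random(0)
random_logliks = []
for _ in range(1000):
    p_u = random_pmf(rng, 2)
    p_x_given_u = [random_pmf(rng, 2) for _ in range(2)]
    p_y_given_u = [random_pmf(rng, 2) for _ in range(2)]
    random_logliks.append(original_loglik(counts, p_u, p_x_given_u, p_y_given_u))

# (ii) A parametrisation consistent with the frequencies attains the maximum:
# let U copy X, i.e. P(u) = n(x=u)/N, P(x|u) = 1 if x == u, P(y|u) = n(u, y)/n(x=u).
n_x = [counts[(x, 0)] + counts[(x, 1)] for x in (0, 1)]
p_u = [n_x[0] / n_total, n_x[1] / n_total]
p_x_given_u = [[1.0, 0.0], [0.0, 1.0]]
p_y_given_u = [[counts[(u, y)] / n_x[u] for y in (0, 1)] for u in (0, 1)]
attained = original_loglik(counts, p_u, p_x_given_u, p_y_given_u)

print("empirical maximum:", lam_star)
print("best random draw: ", max(random_logliks))
print("consistent params:", attained)
```

In this toy case the bound is just Gibbs' inequality, since the empirical model is saturated; the paper's result concerns the general factorised empirical BN, which this sketch does not construct.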

Paper Structure

This paper contains 4 sections, 2 theorems, 8 equations, 2 figures.

Key Result

Proposition 1

The marginal PMF over the internal variables $P(\bm{Z})$ factorises as $P(\bm{z})=\prod_{Z\in\bm{Z}} P(z|\bm{w}_Z)$ for each $\bm{z}$, where the states $(z,\bm{w}_Z)$ are those consistent with $\bm{z}$, and $\bm{W}_Z$ denotes the union of the internal variables of the same c-component of $Z$ and their internal parents, after the removal of $Z$ and its descendants.

Figures (2)

  • Figure 1: A BN over two Boolean variables and its CPTs.
  • Figure 2: A BN with deterministic internal CPTs.

Theorems & Definitions (3)

  • Proposition 1
  • Theorem 1
  • proof