
Renormalized Normalized Maximum Likelihood and Three-Part Code Criteria For Learning Gaussian Networks

Borzou Alipourfard, Jean X. Gao

TL;DR

Two new metrics for scoring Bayesian networks in the continuous domain are introduced: the three-part minimum description length metric and the renormalized normalized maximum likelihood (RNML) metric. Both are free of hyperparameters, decomposable, and asymptotically consistent.

Abstract

Score-based learning (SBL) is a promising approach for learning Bayesian networks in the discrete domain. However, when employing SBL in the continuous domain, one is forced either to move the problem to the discrete domain or to use metrics such as BIC/AIC, and these approaches are often lacking. Discretization can have an undesired impact on the accuracy of the results, and BIC/AIC can fall short of achieving the desired accuracy. In this paper, we introduce two new metrics for scoring Bayesian networks in the continuous domain: the three-part minimum description length metric and the renormalized normalized maximum likelihood (RNML) metric. We rely on the minimum description length principle in formulating these metrics. The proposed metrics are free of hyperparameters, decomposable, and asymptotically consistent. We evaluate our solution by studying the convergence rate of the learned graph to the generating network and the structural Hamming distance (SHD) of the learned graph to the generating network. Our evaluations show that the proposed metrics outperform their competitors, the BIC/AIC metrics. Furthermore, using the proposed RNML metric, SBL has the fastest rate of convergence with the smallest structural Hamming distance to the generating network.

Paper Structure

This paper contains 11 sections, 41 equations, 4 figures, 1 table.

Figures (4)

  • Figure 1: The average rank of the generating network structure for networks having $m$ nodes, $m \in \{4,5\}$, over 500 iterations plotted against the sample size. The upper plot shows the results for networks having four nodes, while the convergence rate for networks having five nodes is shown in the lower plot.
  • Figure 2: Average SHD between the generating DAG and the prime DAG in simulations with $m=8$. The upper plot shows the results for graphs where the expected number of neighbours for each node was set to $nn = 2$. The middle and lower plots show the results for $nn = 4$ and $nn = 6$, respectively.
  • Figure 3: Average SHD between the generating DAG and the prime DAG in simulations with $m=10$. The upper plot shows the results for graphs where the expected number of neighbours for each node was set to $nn = 2$. The middle and lower plots show the results for $nn = 4$ and $nn = 6$, respectively.
  • Figure 4: Average SHD between the generating DAG and the prime DAG in simulations with $m=15$. The upper plot shows the results for graphs where the expected number of neighbours for each node was set to $nn = 2$. The middle and lower plots show the results for $nn = 4$ and $nn = 6$, respectively.
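
Figures 2 to 4 report the structural Hamming distance (SHD) between two graphs. As a rough illustration of what such a number measures, the sketch below computes one common SHD variant in Python. It is not taken from the paper: the function name `structural_hamming_distance`, the adjacency-matrix encoding (`a[i, j] = 1` meaning an edge from node `i` to node `j`), and the choice to count a reversed edge as a single error are all assumptions made here, and the paper may compare graphs differently (for example, on equivalence classes rather than on DAGs directly).

```python
import numpy as np


def structural_hamming_distance(a, b):
    """Count the edge differences between two DAGs (hypothetical helper).

    Assumed convention: a[i, j] == 1 means a directed edge i -> j.
    A missing edge, an extra edge, and a reversed edge each count as 1.
    """
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    if a.ndim != 2 or a.shape[0] != a.shape[1] or a.shape != b.shape:
        raise ValueError("both inputs must be square matrices of the same size")

    # Undirected skeletons: nodes i and j are adjacent if either direction is set.
    skel_a = a | a.T
    skel_b = b | b.T

    # Look at each unordered node pair once (strict upper triangle).
    upper = np.triu_indices(a.shape[0], k=1)

    # Pairs adjacent in one graph but not in the other (missing or extra edges).
    skeleton_errors = np.sum(skel_a[upper] != skel_b[upper])

    # Pairs adjacent in both graphs but with a different orientation.
    shared = skel_a & skel_b
    orientation_errors = np.sum((shared & (a != b))[upper])

    return int(skeleton_errors + orientation_errors)


# Generating graph: 0 -> 1, 1 -> 2
generating = np.array([[0, 1, 0],
                       [0, 0, 1],
                       [0, 0, 0]])

# Learned graph: 1 -> 0 (reversed), 0 -> 2 (extra), and 1 -> 2 is missing
learned = np.array([[0, 0, 1],
                    [1, 0, 0],
                    [0, 0, 0]])

print(structural_hamming_distance(generating, learned))  # 3
```

Under this convention an SHD of 0 means the two graphs have identical edges and orientations, so lower curves in the figures correspond to learned structures closer to the generating DAG.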