Estimating Staged Event Tree Models via Hierarchical Clustering on the Simplex

Muhammad Shoaib, Eva Riccomagno, Manuele Leonelli, Gherardo Varando

Abstract

Staged tree models enhance Bayesian networks by incorporating context-specific dependencies through a stage-based structure. In this study, we present a new framework for estimating staged trees using hierarchical clustering on the probability simplex with simplex-based divergences. We evaluate several distance and divergence measures, including Total Variation, Hellinger, Fisher, and Kaniadakis, alongside various linkage methods such as Ward.D2, average, complete, and McQuitty. Simulation experiments reveal that Total Variation, especially when combined with Ward.D2 linkage, consistently produces staged trees with better model fit, structure recovery, and computational efficiency. We assess performance using the relative Bayesian Information Criterion (BIC) and Hamming distance. Our findings indicate that although Backward Hill Climbing (BHC) delivers competitive outcomes, it incurs a significantly higher computational cost. On the other hand, Total Variation divergence with Ward.D2 linkage achieves similar performance at a significantly lower computational cost, making it a more viable option for large-scale or time-sensitive tasks.
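
The core idea in the abstract, grouping vertices into stages by clustering their transition probability vectors on the simplex, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `total_variation` and `cluster_stages` and the example probabilities are hypothetical, and average linkage is swapped in for Ward.D2 to keep the code short and dependency-free.

```python
from itertools import combinations


def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def cluster_stages(probs, n_stages):
    """Agglomerative clustering (average linkage) of probability vectors.

    probs: dict mapping a vertex label to its estimated transition
    probabilities (a point on the simplex).
    Returns a list of sets of labels, one set per stage.
    """
    clusters = [{label} for label in probs]

    def linkage(c1, c2):
        # Average pairwise total variation between members of two clusters.
        pairs = [(u, v) for u in c1 for v in c2]
        total = sum(total_variation(probs[u], probs[v]) for u, v in pairs)
        return total / len(pairs)

    while len(clusters) > n_stages:
        # Find the two closest clusters and merge them.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


# Hypothetical conditional distributions of X3 given each (X1, X2) context.
probs = {
    "a0": [0.70, 0.20, 0.10],
    "a1": [0.68, 0.22, 0.10],
    "b0": [0.10, 0.30, 0.60],
    "b1": [0.12, 0.28, 0.60],
    "c0": [0.34, 0.33, 0.33],
}
stages = cluster_stages(probs, n_stages=3)
```

Vertices placed in the same set would share one stage, that is, one common transition distribution. In practice the number of stages would be chosen by a criterion such as BIC rather than fixed in advance.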

Paper Structure (20 sections, 9 equations, 6 figures, 6 tables, 1 algorithm)

Figures (6)

  • Figure 1: Event tree (left) and staged event tree (right) for $(X_1,X_2,X_3)$ with $\mathcal{X}_1=\{a,b,c\}$, $\mathcal{X}_2=\{0,1\}$, and $\mathcal{X}_3=\{1,2,3\}$. Left: edges labeled as $X_i=\cdot$. Right: edges labeled as $X_i=\cdot$ with stage transition probabilities in parentheses.
  • Figure 2: Comparison of $\Delta_{\mathrm{BIC}}$ and $\Delta_{\mathrm{HD}}$ across divergence-based staged event tree classifiers for $p = 11$, under the following configuration: $k_{0}, k = \mathrm{NA}$, $q = 0.9$.
  • Figure 3: Comparison of $\Delta_{\mathrm{BIC}}$, $\Delta_{\mathrm{HD}}$, and computation time across divergence-based staged event tree classifiers for sample size $512$, with $k_{0}, k = \mathrm{NA}$, $q = 0.9$.
  • Figure 4: Accuracy and F1 score for all staged event tree classifiers using various divergence measures, with $k = 2$.
  • Figure 5: Comparison of $\Delta_{\mathrm{BIC}}$ and $\Delta_{\mathrm{HD}}$ across divergence-based staged event tree classifiers for $p = 11$, under the following configuration: $k_{0} = 2$, $k = \mathrm{NA}$, $q = \mathrm{NA}$.
  • ...and 1 more figures

Theorems & Definitions (3)

  • Definition 1: $X$-compatible event tree
  • Definition 2: $X$-compatible staged event tree
  • Definition 3: Staged event tree model