The Impact of Variable Ordering on Bayesian Network Structure Learning

Neville K Kitson, Anthony C Constantinou

TL;DR

This paper investigates how the arbitrary order in which variables appear in the input data can dramatically affect Bayesian network structure learning, often more than commonly tested factors such as sample size or hyper-parameters. The authors evaluate 16 discrete networks with several algorithms (HC, Tabu, MMHC, H2PC, PC-Stable, GS, Inter-IAMB) under different variable orderings and show that ordering alone can cause large swings in structural accuracy, measured as F1 on CPDAGs, with differences exceeding 0.5 for some networks. HC is particularly vulnerable because the variable order drives arbitrary edge-orientation decisions early in learning. Ordering also affects the other algorithms to varying degrees; constraint-based methods tend to be less sensitive but are not immune. The findings raise questions about the validity of comparative benchmarks and published results that do not account for variable-order sensitivity. The authors advocate mitigations such as Bayesian Model Averaging or searching over the space of orderings, and point to possible extensions to continuous data and causal inference tasks. Overall, the work highlights a previously underappreciated source of variability in structure learning, with practical implications for causal discovery and intervention modelling.
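
The order-induced orientation decisions mentioned above can be illustrated with a small sketch. It is not the authors' implementation or any library's HC: the helper names `bic_two_node` and `first_best_arc`, the toy counts, and the tie tolerance are all assumptions made for illustration. The sketch relies only on the fact that BIC is score-equivalent, so A -> B and B -> A receive the same score and a search that keeps the first best candidate will orient the edge according to column order.

```python
import math
from collections import Counter
from itertools import permutations

def bic_two_node(rows, parent, child):
    """BIC of the two-variable DAG parent -> child on discrete rows (list of dicts)."""
    n = len(rows)
    n_parent = Counter(r[parent] for r in rows)
    n_joint = Counter((r[parent], r[child]) for r in rows)
    # Log-likelihood: marginal of the parent plus conditional of the child.
    ll = sum(c * math.log(c / n) for c in n_parent.values())
    ll += sum(c * math.log(c / n_parent[p]) for (p, _), c in n_joint.items())
    # Free parameters: (|parent| - 1) + |parent| * (|child| - 1).
    k_parent = len(n_parent)
    k_child = len({r[child] for r in rows})
    n_params = (k_parent - 1) + k_parent * (k_child - 1)
    return ll - 0.5 * n_params * math.log(n)

def first_best_arc(rows, columns, tol=1e-9):
    """Return the first arc, in column order, that reaches the top score."""
    best_arc, best_score = None, -math.inf
    for parent, child in permutations(columns, 2):
        score = bic_two_node(rows, parent, child)
        if score > best_score + tol:  # ties keep the earlier candidate
            best_arc, best_score = (parent, child), score
    return best_arc

# Hypothetical toy data: 100 rows over two dependent binary variables.
counts = [(0, 0, 40), (0, 1, 10), (1, 0, 15), (1, 1, 35)]
rows = [{"A": a, "B": b} for a, b, k in counts for _ in range(k)]

score_ab = bic_two_node(rows, "A", "B")
score_ba = bic_two_node(rows, "B", "A")
arc_for_ab_order = first_best_arc(rows, ["A", "B"])
arc_for_ba_order = first_best_arc(rows, ["B", "A"])
```

Under these assumptions the two scores agree up to floating-point error, and the chosen arc flips when the column order is reversed even though the data are identical. In a larger network, such an early arbitrary choice can constrain later orientations, which is the kind of effect the summary attributes to HC.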

Abstract

Causal Bayesian Networks provide an important tool for reasoning under uncertainty with potential application to many complex causal systems. Structure learning algorithms that can tell us something about the causal structure of these systems are becoming increasingly important. In the literature, the validity of these algorithms is often tested for sensitivity over varying sample sizes, hyper-parameters, and occasionally objective functions. In this paper, we show that the order in which the variables are read from data can have much greater impact on the accuracy of the algorithm than these factors. Because the variable ordering is arbitrary, any significant effect it has on learnt graph accuracy is concerning, and this raises questions about the validity of the results produced by algorithms that are sensitive to, but have not been assessed against, different variable orderings.

Paper Structure

This paper contains 12 sections, 6 equations, 10 figures, and 3 tables.

Figures (10)

  • Figure 1: DAGs with three variables and two edges, their distinctive independence or dependence relationships, the MECs they belong to and the CPDAG representing the MEC.
  • Figure 2: Arbitrary DAG changes in the HC algorithm using 10,000 rows of data.
  • Figure 3: F1 against sample size for each variable ordering and network for the HC algorithm. Each plot starts at the sample size at which there are no single-valued variables. (Note that the red and black lines are coincident for the Sachs network, which is why the former is not visible.)
  • Figure 4: Impact on F1 scores (CPDAG) of changing sample size, variable ordering, score or hyper-parameters across all networks using the HC algorithm with sample sizes between $10^3$ and $10^6$. Each plot shows the mean change as a number, the median change as a horizontal black line, the interquartile range as the coloured rectangle, and the minimum and maximum values as whiskers.
  • Figure 5: The sensitivity of different algorithms to variable ordering compared to other factors in terms of impact on F1 score (CPDAG). The comparisons are: sample size increased by 100 times; variable ordering changed from worst to optimal; score function changed from BIC to BDeu for score-based and hybrid algorithms, or the CI test changed from Mutual Information to Chi-squared for constraint-based algorithms; and the complexity scaling hyper-parameter changed from 1 to 5 for score-based and hybrid algorithms, or the p-value significance hyper-parameter changed from 0.05 to 0.01 for constraint-based algorithms.
  • ...and 5 more figures
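
The relationship described in the Figure 1 caption (three-variable DAGs, their Markov equivalence classes, and the CPDAG for each class) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not code from the paper: the function names and the DAG labels are hypothetical, DAGs are represented as sets of `(parent, child)` tuples, and `cpdag` assumes it is given every member of the class. It uses the standard criterion that two DAGs are Markov equivalent when they share the same skeleton and the same v-structures.

```python
from itertools import combinations

def skeleton(dag):
    """Undirected version of the DAG: a set of unordered variable pairs."""
    return frozenset(frozenset(arc) for arc in dag)

def v_structures(dag):
    """Colliders x -> z <- y where x and y are not adjacent."""
    skel = skeleton(dag)
    found = set()
    for z in {child for _, child in dag}:
        parents = sorted(p for p, c in dag if c == z)
        for x, y in combinations(parents, 2):
            if frozenset((x, y)) not in skel:
                found.add((x, z, y))
    return frozenset(found)

def group_into_mecs(dags):
    """Group named DAGs that share both skeleton and v-structures."""
    classes = {}
    for name, dag in dags.items():
        key = (skeleton(dag), v_structures(dag))
        classes.setdefault(key, []).append(name)
    return list(classes.values())

def cpdag(members):
    """Arcs oriented identically in every member stay directed;
    all other edges become undirected. Assumes `members` is the full class."""
    directed = set.intersection(*(set(d) for d in members))
    undirected = {frozenset(a) for d in members for a in d if a not in directed}
    return directed, undirected

# The four DAGs on A, B, C with edges A-B and B-C (labels are illustrative).
dags = {
    "chain_fwd": {("A", "B"), ("B", "C")},
    "chain_rev": {("C", "B"), ("B", "A")},
    "fork": {("B", "A"), ("B", "C")},
    "collider": {("A", "B"), ("C", "B")},
}

mecs = group_into_mecs(dags)
chain_class_cpdag = cpdag([dags[name] for name in mecs[0]])
collider_cpdag = cpdag([dags[name] for name in mecs[1]])
```

With these definitions, the two chains and the fork fall into one class whose CPDAG has both edges undirected, while the collider forms its own class with both arcs directed. Scoring on CPDAGs rather than DAGs therefore avoids penalising a learner for orientations that the data cannot distinguish.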