The Complexity of Bayesian Network Learning: Revisiting the Superstructure
Robert Ganian, Viktoriia Korchemna
TL;DR
The paper advances the parameterized complexity landscape of Bayesian Network Structure Learning (BNSL) by analyzing parameters of the superstructure. It shows that BNSL is fixed-parameter tractable (FPT) when parameterized by the feedback edge number $\operatorname{fen}$, an edge-deletion distance measure, via a polynomial kernel with at most $16k$ variables. It also proves fixed-parameter tractability for the localized variant $\operatorname{lfen}$ through dynamic programming, and provides tight hardness results for treecut width and related parameters. It further distinguishes between the non-zero and additive score representations, showing that under the additive representation BNSL becomes FPT when parameterized by treewidth, and extends these insights to Polytree Learning (PL), including NP-hardness in the additive setting and a polynomial-time solution for PL$^+$. Overall, the work maps the tractability boundaries across a broad set of graph-structural parameters and input representations, combining kernelization, Courcelle-type reasoning, and matroid-based methods for tractable learning of graphical models.
Abstract
We investigate the parameterized complexity of Bayesian Network Structure Learning (BNSL), a classical problem that has received significant attention in both empirical and purely theoretical studies. We follow up on previous works that have analyzed the complexity of BNSL w.r.t. the so-called superstructure of the input. While known results imply that BNSL is unlikely to be fixed-parameter tractable even when parameterized by the size of a vertex cover in the superstructure, here we show that a different kind of parameterization, notably by the size of a feedback edge set, yields fixed-parameter tractability. We proceed by showing that this result can be strengthened to a localized version of the feedback edge set, and provide corresponding lower bounds that complement previous results to provide a complexity classification of BNSL w.r.t. virtually all well-studied graph parameters. We then analyze how the complexity of BNSL depends on the representation of the input. In particular, while the bulk of past theoretical work on the topic assumed the use of the so-called non-zero representation, here we prove that if an additive representation can be used instead, then BNSL becomes fixed-parameter tractable even under significantly milder restrictions on the superstructure, notably when parameterized by the treewidth alone. Last but not least, we show how our results can be extended to the closely related problem of Polytree Learning.
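To make the central parameter concrete, the following is a minimal Python sketch, not taken from the paper, of how the feedback edge number of a superstructure could be computed. It assumes a hypothetical input format in which each variable maps candidate parent sets to their non-zero scores, and it assumes that the superstructure joins a variable to every member of each of its non-zero-scored parent sets. The function names `superstructure_edges` and `feedback_edge_number` are illustrative only. The sketch relies on the standard fact that the minimum feedback edge set of an undirected graph has size $|E| - |V| + c$, where $c$ is the number of connected components.

```python
def superstructure_edges(scores):
    """Build undirected superstructure edges from non-zero parent-set scores.

    `scores` (assumed format): dict mapping each variable to a dict
    {frozenset(parents): score}. Only non-zero entries contribute edges.
    """
    edges = set()
    for child, table in scores.items():
        for parents, score in table.items():
            if score == 0:
                continue
            for p in parents:
                if p != child:  # ignore degenerate self-parents
                    edges.add(frozenset((p, child)))
    return edges


def feedback_edge_number(vertices, edges):
    """Return |E| - |V| + c, the size of a minimum feedback edge set."""
    parent = {v: v for v in vertices}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = len(parent)
    for e in edges:
        a, b = tuple(e)
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
            components -= 1
    return len(edges) - len(parent) + components


# Toy instance: a triangle on a, b, c plus an isolated variable d.
scores = {
    "a": {frozenset({"b"}): 1.0},
    "b": {frozenset({"c"}): 2.0},
    "c": {frozenset({"a"}): 0.5, frozenset({"a", "b"}): 1.5},
    "d": {},
}
edges = superstructure_edges(scores)
vertices = set(scores) | {v for e in edges for v in e}
fen = feedback_edge_number(vertices, edges)
print(fen)
```

In this toy instance the superstructure has 3 edges, 4 vertices and 2 connected components, so deleting a single edge of the triangle leaves a forest.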
