Learnability of Parameter-Bounded Bayes Nets

Arnab Bhattacharyya, Davin Choo, Sutanu Gayen, Dimitrios Myrisiotis

TL;DR

This work proves that learning parameter-bounded Bayesian networks remains $\mathsf{NP}$-hard even under the promise that the distribution is Markov to some network within a fixed parameter budget. It extends prior hardness results and adds a finite-sample guarantee: given samples from a distribution $\mathbb{P}$ that is Markov to a $p$-parameter Bayes net, one can select a network and construct an approximation $\mathbb{Q}$ with $d_{TV}(\mathbb{P}, \mathbb{Q}) \le \varepsilon$, using a number of samples that scales with $p$, $n$, and the alphabet size $|\Sigma|$. The methodological core combines independence-oracle-based reductions, $\varepsilon$-net constructions over network structures, and a Scheffé-style tournament that picks a near-best candidate from a finite set, generalizing degree-bounded results to the parameter-bounded regime. Together, these results delineate both the computational hardness and the sample complexity of learning compact probabilistic models, with implications for structure learning under tight resource constraints.
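To make the tournament step concrete, here is a minimal sketch of a classic Scheffé tournament over a finite candidate set. It is an illustration under stated assumptions, not the paper's procedure: candidates are assumed to be explicit probability tables over a small finite domain, and the names `scheffe_winner` and `scheffe_tournament` are hypothetical. The paper may rely on a faster tournament variant than this quadratic round-robin.

```python
from collections import Counter
from itertools import combinations


def scheffe_winner(q1, q2, samples):
    """Run one Scheffé test; return 0 if q1 wins, 1 if q2 wins.

    q1, q2: dicts mapping outcomes to probabilities (hypothetical format).
    samples: non-empty list of outcomes drawn from the unknown distribution.
    """
    support = set(q1) | set(q2)
    # Scheffé set: outcomes where q1 puts strictly more mass than q2.
    w = {x for x in support if q1.get(x, 0.0) > q2.get(x, 0.0)}
    mass1 = sum(q1.get(x, 0.0) for x in w)
    mass2 = sum(q2.get(x, 0.0) for x in w)
    # Empirical mass that the samples assign to the Scheffé set.
    emp = sum(1 for s in samples if s in w) / len(samples)
    # The candidate whose mass on W is closer to the empirical mass wins.
    return 0 if abs(mass1 - emp) <= abs(mass2 - emp) else 1


def scheffe_tournament(candidates, samples):
    """Round-robin over all pairs; return the index with the most wins."""
    wins = Counter()
    for i, j in combinations(range(len(candidates)), 2):
        winner = (i, j)[scheffe_winner(candidates[i], candidates[j], samples)]
        wins[winner] += 1
    return max(range(len(candidates)), key=lambda k: wins[k])


if __name__ == "__main__":
    # Three toy candidates over {0, 1}; the samples favour the second one.
    candidates = [{0: 0.5, 1: 0.5}, {0: 0.9, 1: 0.1}, {0: 0.1, 1: 0.9}]
    samples = [0] * 9 + [1]
    print(scheffe_tournament(candidates, samples))  # prints 1
```

Each pairwise test only needs the mass of one set under both candidates and under the samples, which is why a tournament of this kind can get by with a number of samples that grows with the logarithm of the number of candidates rather than with the domain size.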

Abstract

Bayes nets are extensively used in practice to efficiently represent joint probability distributions over a set of random variables and capture dependency relations. In a seminal paper, Chickering et al. (JMLR 2004) showed that given a distribution $\mathbb{P}$, that is defined as the marginal distribution of a Bayes net, it is $\mathsf{NP}$-hard to decide whether there is a parameter-bounded Bayes net that represents $\mathbb{P}$. They called this problem LEARN. In this work, we extend the $\mathsf{NP}$-hardness result of LEARN and prove the $\mathsf{NP}$-hardness of a promise search variant of LEARN, whereby the Bayes net in question is guaranteed to exist and one is asked to find such a Bayes net. We complement our hardness result with a positive result about the sample complexity that is sufficient to recover a parameter-bounded Bayes net that is close (in TV distance) to a given distribution $\mathbb{P}$, that is represented by some parameter-bounded Bayes net, generalizing a degree-bounded sample complexity result of Brustle et al. (EC 2020).
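As a small illustration of what "parameter-bounded" measures, the sketch below counts the free parameters of a Bayes net whose variables all share one alphabet. It assumes the standard convention that a node with $k$ parents needs $(|\Sigma| - 1) \cdot |\Sigma|^{k}$ free parameters; the paper's exact definition may differ, and the function name `num_parameters` and the parent-map format are hypothetical.

```python
def num_parameters(parents, alphabet_size):
    """Count free parameters of a Bayes net (assumed standard convention).

    parents: dict mapping each node to a list of its parent nodes
             (hypothetical format chosen for this sketch).
    alphabet_size: number of values each variable can take, i.e. |Sigma|.
    """
    total = 0
    for node, pa in parents.items():
        # One conditional distribution per parent configuration, each with
        # (alphabet_size - 1) free entries because probabilities sum to 1.
        total += (alphabet_size - 1) * alphabet_size ** len(pa)
    return total


if __name__ == "__main__":
    chain = {"X1": [], "X2": ["X1"], "X3": ["X2"]}            # X1 -> X2 -> X3
    collider = {"X1": [], "X2": [], "X3": ["X1", "X2"]}       # X1 -> X3 <- X2
    print(num_parameters(chain, 2))     # prints 5
    print(num_parameters(collider, 2))  # prints 6
```

Under this convention the parameter count grows exponentially in a node's in-degree, so a parameter budget $p$ also limits in-degrees, while still allowing a few high-degree nodes that a uniform degree bound would rule out.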

Paper Structure

This paper contains 21 sections, 13 theorems, 19 equations, 2 figures.

Key Result

Theorem 1.3

REALIZABLE-LEARN is $\mathsf{NP}$-hard.

Figures (2)

  • Figure 1: Left: A Bayes net $\mathcal{G}$ that represents the distribution $\mathbb{P}$ of our example. Right: A Bayes net $\mathcal{H}$ that represents the distribution obtained from $\mathbb{P}$ by marginalizing out $X_3$.
  • Figure 2: Gavril (1977) showed that DBFAS is $\mathsf{NP}$-hard, and Chickering et al. (2004) showed that LEARN-DBFAS is $\mathsf{NP}$-hard, even when given access to an independence oracle for $\mathbb{P}$. REALIZABLE-LEARN is a variant of LEARN-DBFAS with the additional promise that there exists a Bayes net $\mathcal{G}$ with at most $p$ parameters such that $\mathbb{P}$ is Markov with respect to $\mathcal{G}$. In this work, we show that if one can learn such a Bayes net $\mathcal{G}$ (via some black-box polynomial-time algorithm Learner), then there is a polynomial-time algorithm Reduction that correctly answers LEARN-DBFAS. Therefore, REALIZABLE-LEARN is also $\mathsf{NP}$-hard.

Theorems & Definitions (28)

  • Remark 1.1
  • Definition 1.2: The REALIZABLE-LEARN problem
  • Theorem 1.3
  • Theorem 1.4: Approximating parameter-bounded Bayes nets using samples
  • Definition 2.1: The DBFAS decision problem
  • Definition 2.2: The LEARN decision problem
  • Definition 2.3: The LEARN-DBFAS decision problem
  • Theorem 2.4: Chickering et al. (2004)
  • Theorem 2.5: Daskalakis and Kamath (2014)
  • Lemma 2.6: Brustle et al. (2020)
  • ...and 18 more