Structural perspective on constraint-based learning of Markov networks
Tuukka Korhonen, Fedor V. Fomin, Pekka Parviainen
TL;DR
This work analyzes the complexity of constraint-based structure learning for Markov networks under faithfulness and an independence oracle, focusing on how graph structure governs the sizes of conditioning sets and the number of tests. It identifies the maximum pairwise connectivity $κ(G)$ as the key parameter and proves a tight characterization: at least one test with a conditioning set of size at least $κ(G)$ is necessary, and tests with conditioning sets of size at most $κ(G)$ are sufficient, as shown by a constructive algorithm that uses $|V(G)|^{κ(G)+2}$ tests. It also proves a matching lower bound: some graphs require $|V(G)|^{Ω(κ)}$ tests, even when their treewidth and maximum degree are at most $κ+2$. In contrast, graphs of bounded treewidth can be learned with a polynomial number of tests whose conditioning sets have size at most $2κ$. A treewidth-driven refinement yields an algorithm that learns the underlying graph with $|V(G)|^{O(tw(G))}$ tests of size $O(tw(G))$ plus a small set of tests of size at most $2κ(G)$. Overall, the results offer a structural graph theory perspective on constraint-based learning and motivate future work on finer parameters that could further reduce testing complexity.
Abstract
Markov networks are probabilistic graphical models that employ undirected graphs to depict conditional independence relationships among variables. Our focus lies in constraint-based structure learning, which entails learning the undirected graph from data through the execution of conditional independence tests. We establish theoretical limits concerning two critical aspects of constraint-based learning of Markov networks: the number of tests and the sizes of the conditioning sets. These bounds uncover an exciting interplay between the structural properties of the graph and the number of tests required to learn a Markov network. The starting point of our work is that the graph parameter maximum pairwise connectivity $κ$, that is, the maximum number of vertex-disjoint paths connecting a pair of vertices in the graph, determines the sizes of the independence tests required to learn the graph. On the one hand, we show that at least one test with a conditioning set of size at least $κ$ is always necessary. On the other hand, we prove that any graph can be learned by performing tests with conditioning sets of size at most $κ$. This completely resolves the question of the minimum size of conditioning sets required to learn the graph. When it comes to the number of tests, our upper bound on the sizes of conditioning sets implies that every $n$-vertex graph can be learned by at most $n^κ$ tests with conditioning sets of sizes at most $κ$. We show that for any upper bound $q$ on the sizes of the conditioning sets, there exist graphs with $O(n q)$ vertices that require at least $n^{Ω(κ)}$ tests to learn. This lower bound holds even when the treewidth and the maximum degree of the graph are at most $κ+2$. On the positive side, we prove that every graph of bounded treewidth can be learned by a polynomial number of tests with conditioning sets of sizes at most $2κ$.
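
To make the role of $κ$ concrete, the following is a minimal brute-force sketch in Python, not the algorithm from the paper. It assumes a faithful distribution, so the independence oracle can be simulated by vertex separation in the true graph: two variables are independent given a conditioning set exactly when that set separates them. The names `separated` and `learn_graph`, and the 4-cycle example, are hypothetical and chosen only for illustration. For every pair of vertices, the sketch tries all conditioning sets of size at most `k` and declares an edge when no test reports independence.

```python
from itertools import combinations


def separated(adj, u, v, cond):
    """Simulated oracle: True iff `cond` separates u from v in graph `adj`."""
    blocked = set(cond)
    seen, stack = {u}, [u]
    while stack:
        x = stack.pop()
        for y in adj[x]:
            if y == v:
                return False  # v is reachable while avoiding `cond`
            if y not in blocked and y not in seen:
                seen.add(y)
                stack.append(y)
    return True


def learn_graph(vertices, oracle, k):
    """Brute force: u-v is an edge iff no set of size <= k makes them independent."""
    edges, tests = set(), 0
    for u, v in combinations(vertices, 2):
        others = [w for w in vertices if w not in (u, v)]
        independent = False
        for size in range(k + 1):
            for cond in combinations(others, size):
                tests += 1  # one call to the independence oracle
                if oracle(u, v, cond):
                    independent = True
                    break
            if independent:
                break
        if not independent:
            edges.add(frozenset((u, v)))
    return edges, tests


# Hypothetical example: the 4-cycle a-b-c-d-a, whose maximum pairwise connectivity is 2.
adj = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "a"]}
true_edges = {frozenset(e) for e in [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]}


def oracle(u, v, cond):
    return separated(adj, u, v, cond)


edges_k2, tests_k2 = learn_graph(sorted(adj), oracle, k=2)
edges_k1, tests_k1 = learn_graph(sorted(adj), oracle, k=1)
```

On this example, conditioning sets of size up to 2 recover the 4-cycle exactly, while restricting to size 1 leaves the spurious edges a-c and b-d, since each of those pairs is joined by two vertex-disjoint paths and cannot be separated by a single vertex. The sketch enumerates every candidate conditioning set, so its test count grows as $n^{O(k)}$. It illustrates only why sets of size $κ$ suffice and why smaller sets can fail; it does not reflect the refined bounds for bounded treewidth.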
