Polynomial-time derivation of optimal k-tree topology from Markov networks
Fereshteh R. Dastjerdi, Liming Cai
TL;DR
This work addresses the challenge of efficiently approximating high-dimensional joint distributions by Markov networks whose topology has bounded tree-width $k$, extending the classical Chow-Liu tree approach to $k$-trees. It frames the optimality criterion as a maximum spanning $k$-tree (MS$k$T) problem with $f(\Delta)=I(X; \pi(X))$ and introduces the $\beta$-retaining MS$k$T variant to preserve designated subgraphs, enabling retention of critical subnetworks such as biological pathways or backbone connectivity. The authors show that for graph classes $\beta$ that are bounded-branching friendly (notably bounded-degree spanning trees), the $\beta$-retaining MS$k$T problem is solvable in polynomial time $O(n^{k+2})$ for fixed $k$, with potential improvements to $O(n^{k+1})$ in special cases and strong evidence of optimality via a reduction from $k$-Clique. This framework provides a practical, theoretically grounded method for efficient, loss-minimizing approximation of joint distributions in domains where preserving known substructures is essential, such as gene networks and biomolecule backbone modeling.
Abstract
Characterization of the joint probability distribution for large networks of random variables remains a challenging task in data science. In practice, probabilistic graph approximation with simple topologies has been resorted to; typically the tree topology makes joint probability computation much simpler and can be effective for statistical inference on insufficient data. However, to characterize network components where multiple variables cooperate closely to influence others, model topologies beyond a tree are needed, which unfortunately are infeasible to acquire. In particular, our previous work has related optimal approximation of Markov networks of tree-width k ≥ 2 closely to the graph-theoretic problem of finding a maximum spanning k-tree (MSkT), which is a provably intractable task. This paper investigates optimal approximation of Markov networks with k-tree topology that retains some designated underlying subgraph. Such a subgraph may encode certain background information that arises in scientific applications, for example, about a known significant pathway in gene networks or the indispensable backbone connectivity in the residue interaction graphs for a biomolecule 3D structure. In particular, it is proved that the β-retaining MSkT problem, for a number of classes β of graphs, admits O(n^{k+1})-time algorithms for every fixed k ≥ 1. These β-retaining MSkT algorithms offer efficient solutions for approximation of Markov networks with k-tree topology in situations where certain persistent information needs to be retained.
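
For orientation, the tree case (k = 1) that this work generalizes can be sketched in a few lines: the classical Chow-Liu approach computes pairwise mutual information from data and takes a maximum spanning tree over those weights. The sketch below is only that baseline, not the β-retaining MSkT algorithm of the paper; the function names `mutual_information` and `chow_liu_tree`, the assumption of discrete samples given as equal-length tuples, and the use of Kruskal's method are illustrative choices.

```python
from collections import Counter
from itertools import combinations
from math import log


def mutual_information(samples, i, j):
    """Empirical mutual information I(X_i; X_j), in nats, from discrete samples."""
    n = len(samples)
    count_i = Counter(s[i] for s in samples)
    count_j = Counter(s[j] for s in samples)
    count_ij = Counter((s[i], s[j]) for s in samples)
    # Sum of p(a, b) * log(p(a, b) / (p(a) * p(b))) over observed pairs.
    return sum(
        (c / n) * log(c * n / (count_i[a] * count_j[b]))
        for (a, b), c in count_ij.items()
    )


def chow_liu_tree(samples):
    """Edges of a maximum spanning tree over pairwise mutual information."""
    d = len(samples[0])
    # Heaviest edges first (Kruskal's method for a maximum spanning tree).
    edges = sorted(
        ((mutual_information(samples, i, j), i, j)
         for i, j in combinations(range(d), 2)),
        reverse=True,
    )
    parent = list(range(d))  # union-find forest over the variables

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    for _, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # edge joins two components, so no cycle
            parent[root_i] = root_j
            tree.append((i, j))
    return tree


# Toy data (hypothetical): X0 and X1 are identical, X2 is independent of both.
samples = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)]
tree = chow_liu_tree(samples)
```

On the toy data, the edge between variables 0 and 1 carries the only nonzero mutual information, so it is selected first. For k ≥ 2 the analogous unconstrained problem is the intractable MSkT problem described in the abstract, which is why no comparably simple greedy sketch is given here.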
