Learning Linear Polytree Structural Equation Models
Xingmei Lou, Yu Hu, Xiaodong Li
TL;DR
This work addresses learning the skeleton and CPDAG of Gaussian linear polytree SEMs from $n$ i.i.d. samples of $p$ variables, and provides sharp sample-size characterizations. The approach combines Chow-Liu skeleton recovery on pairwise correlations with threshold-based v-structure detection and Meek-style orientation, complemented by a PC-adapted variant and inverse-correlation-matrix estimation; it further extends to group polytree models. The main contributions are sufficient sample sizes of order $\frac{\log p}{\rho_{\min}^2}$ for skeleton recovery and of order $\frac{\log p}{\rho_{\min}^4}$ for CPDAG recovery, together with matching information-theoretic lower bounds that establish sharpness. The work also gives $\ell_1$ error bounds for inverse-correlation estimation. Empirical results on simulated and benchmark data demonstrate the method's accuracy and scalability, its robustness when the true structure is only approximately a polytree, and practical performance advantages over traditional DAG-learning methods.
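The threshold-based v-structure step can be illustrated with a minimal sketch. It relies on a standard fact about polytrees: for two neighbors $i$ and $j$ of a node $k$, the pair is marginally uncorrelated exactly when $k$ is a collider ($i \to k \leftarrow j$), and otherwise $\rho_{ij} = \rho_{ik}\rho_{kj} \neq 0$. The function name `detect_v_structures`, the example graph, its edge weights, and the threshold value `0.1` are all assumptions made for illustration; this is not the authors' implementation or their threshold rule.

```python
import numpy as np

def detect_v_structures(corr, skeleton_edges, threshold):
    """Return triples (i, k, j) with i < j, read as i -> k <- j.

    A pair of neighbors (i, j) of k is declared a v-structure when the
    absolute sample correlation between i and j falls below `threshold`.
    """
    p = corr.shape[0]
    neighbors = {v: set() for v in range(p)}
    for a, b in skeleton_edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    found = []
    for k in range(p):
        nbrs = sorted(neighbors[k])
        for idx, i in enumerate(nbrs):
            for j in nbrs[idx + 1:]:
                if abs(corr[i, j]) < threshold:
                    found.append((i, k, j))
    return found

# Hypothetical polytree for illustration: 0 -> 2 <- 1 and 2 -> 3.
rng = np.random.default_rng(0)
n = 5000
x0 = rng.standard_normal(n)
x1 = rng.standard_normal(n)
x2 = 0.8 * x0 + 0.8 * x1 + rng.standard_normal(n)
x3 = 0.9 * x2 + rng.standard_normal(n)
data = np.column_stack([x0, x1, x2, x3])

corr = np.corrcoef(data, rowvar=False)
skeleton = [(0, 2), (1, 2), (2, 3)]  # assumed already recovered
v_structures = detect_v_structures(corr, skeleton, threshold=0.1)
print(v_structures)
```

In this example only the pair (0, 1) is nearly uncorrelated, so node 2 is flagged as a collider between them, while the pairs (0, 3) and (1, 3) remain clearly correlated through the chain via node 2.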
Abstract
We are interested in the problem of learning the directed acyclic graph (DAG) when data are generated from a linear structural equation model (SEM) and the causal structure can be characterized by a polytree. Under Gaussian polytree models, we study sufficient conditions on the sample size for the well-known Chow-Liu algorithm to exactly recover both the skeleton and the equivalence class of the polytree, which is uniquely represented by a CPDAG. We also derive necessary conditions on the sample size for both skeleton and CPDAG recovery, in the form of information-theoretic lower bounds; these match the respective sufficient conditions and thereby give a sharp characterization of the difficulty of these tasks. In addition, we consider the problem of inverse correlation matrix estimation under linear polytree models, and establish an estimation error bound in terms of the dimension and the total number of v-structures. We further consider an extension to group linear polytree models, in which each node represents a group of variables. Our theoretical findings are illustrated by comprehensive numerical simulations, and experiments on benchmark data also demonstrate the robustness of polytree learning when the true graphical structures can only be approximated by polytrees.
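As a minimal sketch of the Chow-Liu skeleton step, assume Gaussian data so that pairwise mutual information is an increasing function of the absolute correlation. Under that assumption the Chow-Liu tree is a maximum-weight spanning tree on the absolute sample correlations. The function name `chow_liu_skeleton`, the use of Kruskal's algorithm, and the simulated polytree below are illustrative choices, not taken from the paper.

```python
import numpy as np

def chow_liu_skeleton(data):
    """Maximum-weight spanning tree on |sample correlation| (Kruskal)."""
    corr = np.corrcoef(data, rowvar=False)
    p = corr.shape[0]
    # Candidate edges, sorted from strongest to weakest correlation.
    pairs = sorted(
        ((abs(corr[i, j]), i, j) for i in range(p) for j in range(i + 1, p)),
        reverse=True,
    )

    parent = list(range(p))  # union-find forest for cycle detection

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    edges = []
    for _, i, j in pairs:
        ri, rj = find(i), find(j)
        if ri != rj:  # adding (i, j) does not create a cycle
            parent[ri] = rj
            edges.append((i, j))
    return sorted(edges)

# Hypothetical polytree for illustration: 0 -> 2 <- 1, 2 -> 3, 3 -> 4.
rng = np.random.default_rng(1)
n = 5000
x0 = rng.standard_normal(n)
x1 = rng.standard_normal(n)
x2 = 0.8 * x0 + 0.8 * x1 + rng.standard_normal(n)
x3 = 0.9 * x2 + rng.standard_normal(n)
x4 = 0.7 * x3 + rng.standard_normal(n)
data = np.column_stack([x0, x1, x2, x3, x4])

skeleton = chow_liu_skeleton(data)
print(skeleton)
```

With this many samples the recovered undirected edges coincide with the true skeleton; the result carries no edge directions, which is why a separate orientation step is needed to obtain the CPDAG.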
