Near-Optimal Experiment Design in Linear non-Gaussian Cyclic Models

Ehsan Sharifian, Saber Salehkaleybar, Negar Kiyavash

TL;DR

This paper addresses causal structure learning in cyclic linear non-Gaussian SCMs, where observational data identifies the causal graph only up to a permutation-equivalence class. The authors introduce a bipartite-matching representation of this class and prove that each single-node intervention reveals one edge of the true matching, shrinking the class. They formulate adaptive experiment design as an adaptive submodular optimization problem with a sampling-based reward estimator, which yields a greedy policy with a near-optimality guarantee. Experiments on synthetic graphs show performance close to the feedback vertex set (FVS) lower bound and robust behavior under finite-sample ICA; extensions to multi-node interventions and non-linear settings are discussed.

Abstract

We study the problem of causal structure learning from a combination of observational and interventional data generated by a linear non-Gaussian structural equation model that might contain cycles. Recent results show that using mere observational data identifies the causal graph only up to a permutation-equivalence class. We obtain a combinatorial characterization of this class by showing that each graph in an equivalence class corresponds to a perfect matching in a bipartite graph. This bipartite representation allows us to analyze how interventions modify or constrain the matchings. Specifically, we show that each atomic intervention reveals one edge of the true matching and eliminates all incompatible causal graphs. Consequently, we formalize the optimal experiment design task as an adaptive stochastic optimization problem over the set of equivalence classes with a natural reward function that quantifies how many graphs are eliminated from the equivalence class by an intervention. We show that this reward function is adaptive submodular and provide a greedy policy with a provable near-optimal performance guarantee. A key technical challenge is to efficiently estimate the reward function without having to explicitly enumerate all the graphs in the equivalence class. We propose a sampling-based estimator using random matchings and analyze its bias and concentration behavior. Our simulation results show that performing a small number of interventions guided by our stochastic optimization framework recovers the true underlying causal structure.
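The matching view described in the abstract can be illustrated on a toy example. The sketch below is not the authors' implementation: it assumes a hypothetical 0/1 support matrix in which `support[r][v]` is true when row `r` may be assigned to variable `v`, enumerates perfect matchings by brute force (feasible only for tiny graphs), and assumes a uniform prior over the class when scoring an intervention. The names `perfect_matchings`, `expected_eliminated`, and `greedy_choice` are illustrative.

```python
from itertools import permutations


def perfect_matchings(support):
    """Enumerate all assignments p with support[p[v]][v] true for every v.

    p[v] is the row matched to variable v. Brute force over permutations,
    so this is only meant for very small examples.
    """
    n = len(support)
    return [p for p in permutations(range(n))
            if all(support[p[v]][v] for v in range(n))]


def expected_eliminated(matchings, v):
    """Expected number of candidates removed by intervening on variable v.

    Assumption: the true matching is uniform over `matchings`, and the
    intervention reveals which row is matched to v.
    """
    total = len(matchings)
    sizes = {}
    for p in matchings:
        sizes[p[v]] = sizes.get(p[v], 0) + 1
    # With probability s/total the true row has s compatible candidates,
    # so total - s candidates are eliminated.
    return sum(s / total * (total - s) for s in sizes.values())


def greedy_choice(matchings, candidates):
    """Pick the variable with the largest expected elimination count."""
    return max(candidates, key=lambda v: expected_eliminated(matchings, v))


# Hypothetical support pattern: rows 0 and 1 are interchangeable for
# variables 0 and 1, while variable 2 can only be matched to row 2.
support = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
]
matchings = perfect_matchings(support)
best = greedy_choice(matchings, [0, 1, 2])
```

In this toy instance variable 2 has only one admissible row, so intervening on it eliminates nothing, while intervening on variable 0 or 1 resolves the class. Brute-force enumeration does not scale, which is the motivation the abstract gives for a sampling-based estimator.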

Paper Structure

This paper contains 64 sections, 10 theorems, 73 equations, 6 figures, 1 table, 2 algorithms.

Key Result

Theorem 4.4

Strongly connected components remain unchanged in distribution-equivalent graphs. All proofs are provided in the appendix.

Figures (6)

  • Figure 1: (a) Graph representation of cycle reversion. (b) The figure highlights representative nodes from an SCC to show that their reachability is maintained, although the full SCC is not depicted.
  • Figure 2: Comparison of intervention strategies assuming an ideal ICA oracle. Our method (Adaptive) consistently performs close to the feedback vertex set (FVS) lower bound.
  • Figure 3: (a) Each row permutation corresponds to a matching between sources and observed variables. (b) An intervention on a specific variable reveals its correct row assignment, narrowing the equivalence class.
  • Figure 4: Illustration of equivalence class partitioning induced by a candidate intervention on variable $v=4$. The top node represents the current equivalence class $\Omega_1 = \{G_1, \dots, G_N\}$, and each child $\Omega_i^v$ consists of graphs whose perfect matching assigns variable $v$ to row $r_{z_i}$, i.e., includes the edge $(r_{z_i}, v)$. The bipartite graphs below depict three such subsets, each highlighting a different candidate edge $(r_{z_i}, v)$ in red. This illustrates how an intervention on $v$ resolves ambiguity by eliminating all but one of these subsets, reducing the equivalence class.
  • Figure 5: Comparison of intervention strategies under the ideal ICA assumption with a sample-based matching sampler. Error bars indicate standard deviation over trials.
  • ...and 1 more figure

Theorems & Definitions (23)

  • Definition 4.1: Cycle Reversion
  • Definition 4.2: Strongly Connected Component (SCC)
  • Definition 4.3: Condensation Graph
  • Theorem 4.4
  • Corollary 4.5
  • Proposition 5.1
  • Definition 6.1: Universe and Random Realization
  • Definition 6.2: Partial Realization
  • Definition 6.3: Conditional Expected Marginal Benefit
  • Definition 6.4: Adaptive Monotonicity
  • ...and 13 more
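Definitions 4.2 and 4.3 refer to strongly connected components and the condensation graph, which are standard graph-theoretic objects. As a minimal sketch, assuming a directed graph given as a dictionary from each node to its list of successors, they can be computed with Kosaraju's algorithm. The function names and the example graphs are illustrative, and the edge reversal in the example is a naive reversal of a cycle's direction, not necessarily the paper's Definition 4.1.

```python
def strongly_connected_components(adj):
    """Kosaraju's algorithm. `adj` maps every node to a list of successors."""
    seen = set()

    def dfs(u, graph, out):
        # Recursive depth-first search that appends nodes in post-order.
        seen.add(u)
        for w in graph[u]:
            if w not in seen:
                dfs(w, graph, out)
        out.append(u)

    # First pass: record finishing order on the original graph.
    order = []
    for u in adj:
        if u not in seen:
            dfs(u, adj, order)

    # Build the reversed graph.
    rev = {u: [] for u in adj}
    for u, succs in adj.items():
        for w in succs:
            rev[w].append(u)

    # Second pass: each DFS tree in the reversed graph is one SCC.
    seen.clear()
    comps = []
    for u in reversed(order):
        if u not in seen:
            comp = []
            dfs(u, rev, comp)
            comps.append(frozenset(comp))
    return comps


def condensation_edges(adj, comps):
    """Edges between distinct SCCs; the resulting graph is acyclic."""
    comp_of = {u: c for c in comps for u in c}
    return {(comp_of[u], comp_of[w])
            for u in adj for w in adj[u]
            if comp_of[u] != comp_of[w]}


# Hypothetical graph: a 3-cycle 1 -> 2 -> 3 -> 1 with an edge 3 -> 4.
g = {1: [2], 2: [3], 3: [1, 4], 4: []}
# Same nodes with the cycle traversed in the opposite direction.
g_reversed_cycle = {1: [3], 3: [2, 4], 2: [1], 4: []}

sccs = set(strongly_connected_components(g))
sccs_reversed = set(strongly_connected_components(g_reversed_cycle))
cond = condensation_edges(g, strongly_connected_components(g))
```

Both example graphs have the same node sets as SCCs, which is the kind of invariance the statement under Theorem 4.4 describes for distribution-equivalent graphs.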