Context-Specific Causal Graph Discovery with Unobserved Contexts: Non-Stationarity, Regimes and Spatio-Temporal Patterns

Martin Rabel, Jakob Runge

TL;DR

This paper tackles learning context-specific causal graphs from non-stationary data by formulating multi-valued causal discovery (MCD) and introducing a modular, locality-focused framework (gLD) that directly tests regime-dependent independencies. It extends standard SCMs to non-stationary settings using regime indicators, defines a state-space of potential model changes, and reframes constraint-based causal discovery to operate over regime-marked independence structures. The core algorithm iteratively combines causal discovery with state-space construction, enabling identifiable graphs per regime while preserving compatibility with existing CD methods like PC, FCI, and PCMCI variants. Through theoretical analysis and numerical experiments, the work demonstrates how locality and direct testing can scale to large graphs and complex regime structures, though it also outlines fundamental limits and the need for large sample sizes. The framework is modular, extensible to context patterns beyond time (e.g., space), and aims to provide interpretable, regime-specific causal graphs with practical applicability in domains such as climate science.

Abstract

Real-world data, for example in climate applications, often consists of spatially gridded time series or data with comparable structure. While the underlying system is often believed to behave similarly at different points in space and time, the variations that do exist are relevant in two ways: they often encode important information in and of themselves, and they may negatively affect the stability/convergence and reliability/validity of results of algorithms assuming stationarity or space-translation invariance. We study the information encoded in changes of the causal graph, with stability in mind. An analysis of this general task identifies two core challenges. We develop guiding principles to overcome these challenges, and provide a framework realizing these principles by modifying constraint-based causal discovery approaches on the level of independence testing. This leads to an extremely modular, easily extensible and widely applicable framework. It can leverage existing constraint-based causal discovery methods (demonstrated on the IID algorithms PC, PC-stable and FCI and the time series algorithms PCMCI, PCMCI+ and LPCMCI) with little to no modification. The built-in modularity makes it possible to systematically understand and improve upon an entire array of subproblems. By design, it can be extended by leveraging insights from change-point detection, clustering, independence testing and other well-studied related problems. The division into more accessible subproblems also simplifies the understanding of fundamental limitations, hyperparameters controlling trade-offs and the statistical interpretation of results. An open-source implementation will be available soon.

Paper Structure

This paper contains 54 sections, 5 theorems, 23 equations, 15 figures, 2 algorithms.

Key Result

lemma 1

The map $G: T \rightarrow \mathcal{G}, t \mapsto G(t)$ factors through $\sigma:T\rightarrow S$ ($G$ can be written as a function of $s\in S$), there is a unique mapping $G_{(s)}: S \rightarrow \mathcal{G}$ such that $G(t) = (G_{(s)} \circ \sigma)(t) := G_{(s)}(\sigma(t))$. We will by slight abuse o

Figures (15)

  • Figure 1.1: Simple toy model illustrating the method's output (a; applied with FCI): Data is given on a spatial $100\times 100$ grid (10,000 data points in total); the link marked in orange is present in the southern half, the one marked in blue in the eastern half. This spatial distribution is hidden. Our focus is on how the graph on the left-hand side (including the designation of changing links) can be recovered. In this example, different graphs correspond to (spatial) quadrants (b). While this illustrates the interpretation of output (a), the association of changing links with spatial directions is a very special case, see Fig. 1.2.
  • Figure 1.2: Illustration of complexity scaling for local vs. global descriptions of regimes over time. Locally there are two regimes (top panel); globally the number of regimes increases exponentially in the number of local changes (bottom panel). These global changes in the model can always be decomposed in terms of local changes; this does not require the existence of corresponding spatial directions such as those used in Fig. 1.1b.
  • Figure 1.3: Illustration of direct vs. indirect graph-discovery. Green arrows require only the extraction of low-complexity knowledge, red arrows require the extraction of high-complexity knowledge. The information-content of time-resolved indicators will typically increase with larger sample-size (§\ref{apdx:model_details}), while the information contained in causal graphs does not. So the gradient in information-content (gray arrow), for larger data-sets, points steeply upwards.
  • Figure 1.4: Framework Architecture. Blue boxes are abstract concepts encoding knowledge (dashed: lazily evaluated). Arrows represent algorithmic components converting such knowledge (see §\ref{sec:dynamic_cd}); these components are highly modular: Conventionally one may combine different CD-algorithms with different CITs; in our framework individual components enjoy similar independence. By independence atoms we refer to individual independencies (no relations like d-separations are implied at that stage). The core idea is the introduction of a new abstraction of independence atoms that additionally encode regime-information. Its careful choice enables our framework.
  • Figure 5.1: Our framework turns a (graph-)global clustering problem into (graph-)local ones.
  • ...and 10 more figures
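
The relation between local link indicators and global regimes described in the captions of Figures 1.1 and 1.2 can be made concrete with a small sketch. It only constructs the ground-truth regime layout of the toy example and is not the paper's discovery method. All names (`orange_active`, `blue_active`, `edges_for_state`, `max_global_states`) are hypothetical, the convention that the row index grows southwards is an assumption, and the count $2^k$ is an upper bound that assumes $k$ binary local indicators.

```python
import numpy as np

# Hypothetical reconstruction of the toy layout from the Figure 1.1 caption:
# a 100 x 100 spatial grid, i.e. 10,000 data points in total.
N = 100
rows, cols = np.indices((N, N))

# Local (per-link) regime indicators. Assumption: row index grows southwards,
# column index grows eastwards.
orange_active = rows >= N // 2  # "orange" link present in the southern half
blue_active = cols >= N // 2    # "blue" link present in the eastern half

# A global state is the combination of all local indicators (0..3 here),
# so each spatial quadrant carries its own graph.
global_state = 2 * orange_active.astype(int) + blue_active.astype(int)


def edges_for_state(state: int) -> frozenset:
    """Changing links that are present in a given global state."""
    edges = set()
    if state & 2:
        edges.add("orange")
    if state & 1:
        edges.add("blue")
    return frozenset(edges)


def max_global_states(k: int) -> int:
    """Upper bound on the number of global regimes for k binary local indicators."""
    return 2 ** k


distinct_graphs = {edges_for_state(int(s)) for s in np.unique(global_state)}
points_per_state = np.bincount(global_state.ravel(), minlength=4)

print(len(distinct_graphs))                       # distinct graphs on the grid
print(points_per_state.tolist())                  # grid points per quadrant
print([max_global_states(k) for k in (1, 2, 10)])
```

Two local indicators are enough to describe all four quadrant graphs, whereas a global description has to distinguish every combination separately; this gap grows exponentially with the number of changing links.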

Theorems & Definitions (24)

  • definition 1: Indicators
  • definition 2: Non-Stationary Additive SCM
  • definition 3: Non-Stationary Parents
  • definition 4: Non-Stationary General SCM
  • definition 5: States
  • lemma 1: State-Factorization
  • definition 6: Independence Atoms
  • definition 7: Resolved Graphs
  • definition 8
  • definition 9: CD Correctness
  • ...and 14 more