Context-Specific Causal Graph Discovery with Unobserved Contexts: Non-Stationarity, Regimes and Spatio-Temporal Patterns
Martin Rabel, Jakob Runge
TL;DR
This paper tackles learning context-specific causal graphs from non-stationary data by formulating multi-valued causal discovery (MCD) and introducing a modular, locality-focused framework (gLD) that directly tests regime-dependent independencies. It extends standard SCMs to non-stationary settings using regime indicators, defines a state-space of potential model changes, and reframes constraint-based causal discovery to operate over regime-marked independence structures. The core algorithm iteratively combines causal discovery with state-space construction, enabling identifiable graphs per regime while preserving compatibility with existing CD methods like PC, FCI, and PCMCI variants. Through theoretical analysis and numerical experiments, the work demonstrates how locality and direct testing can scale to large graphs and complex regime structures, though it also outlines fundamental limits and the need for large sample sizes. The framework is modular, extensible to context patterns beyond time (e.g., space), and aims to provide interpretable, regime-specific causal graphs with practical applicability in domains such as climate science.
Abstract
Real-world data, for example in climate applications, often consists of spatially gridded time series or data with comparable structure. While the underlying system is often believed to behave similarly at different points in space and time, the variations that do exist are relevant in two ways: they often encode important information in and of themselves, and they may negatively affect the stability/convergence and reliability/validity of results of algorithms that assume stationarity or space-translation invariance. We study the information encoded in changes of the causal graph, with stability in mind. An analysis of this general task identifies two core challenges. We develop guiding principles to overcome these challenges, and provide a framework realizing these principles by modifying constraint-based causal discovery approaches at the level of independence testing. This leads to an extremely modular, easily extensible and widely applicable framework. It can leverage existing constraint-based causal discovery methods (demonstrated on the IID algorithms PC, PC-stable and FCI and on the time series algorithms PCMCI, PCMCI+ and LPCMCI) with little to no modification. The built-in modularity makes it possible to systematically understand and improve upon an entire array of subproblems. By design, it can be extended by leveraging insights from change-point detection, clustering, independence testing and other well-studied related problems. The division into more accessible subproblems also simplifies the understanding of fundamental limitations, of hyperparameters controlling trade-offs, and of the statistical interpretation of results. An open-source implementation will be available soon.
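
To make the motivation concrete, the following toy sketch shows why a dependence that exists in only one regime is blurred when data from both regimes are pooled. It is not the paper's method: it uses plain Pearson correlation as a stand-in for an independence test, and it assumes the regime boundary is known and aligned with the block boundary, whereas the setting above treats contexts as unobserved. The function name `blockwise_correlations`, the sample size and the data-generating model are all hypothetical choices made for illustration.

```python
import numpy as np


def blockwise_correlations(x, y, block_size):
    """Pearson correlation of x and y within consecutive, non-overlapping blocks."""
    n_blocks = len(x) // block_size
    corrs = []
    for b in range(n_blocks):
        sl = slice(b * block_size, (b + 1) * block_size)
        corrs.append(np.corrcoef(x[sl], y[sl])[0, 1])
    return np.array(corrs)


rng = np.random.default_rng(0)
n = 1000  # samples per regime (illustrative assumption)

# Regime indicator: first n samples in regime 0, next n in regime 1.
regime = np.repeat([0, 1], n)

x = rng.normal(size=2 * n)
noise = rng.normal(size=2 * n)

# The link X -> Y is active only in regime 0; in regime 1, Y is pure noise.
y = np.where(regime == 0, x + noise, noise)

# Pooled estimate mixes both regimes into a single intermediate value.
pooled = np.corrcoef(x, y)[0, 1]

# Blockwise estimates separate the regimes (block boundary = regime boundary here).
per_block = blockwise_correlations(x, y, block_size=n)

print("pooled correlation:   ", round(float(pooled), 3))
print("regime-0 block:       ", round(float(per_block[0]), 3))
print("regime-1 block:       ", round(float(per_block[1]), 3))
```

In this toy model the pooled correlation lands between the two per-regime values, so a single stationary analysis would report a weakened link rather than a link that is present in one regime and absent in the other. With unobserved contexts, block boundaries would generally not coincide with regime changes, which is part of what makes the general task harder than this sketch suggests.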
