Use of Prior Knowledge to Discover Causal Additive Models with Unobserved Variables and its Application to Time Series Data
Takashi Nicholas Maeda, Shohei Shimizu
TL;DR
This work addresses causal discovery in the presence of latent confounders for i.i.d. and time-series data. It builds on CAM-UV, a causal framework based on generalized additive models (GAMs) that accommodates unobserved variables, and proposes two methods: CAM-UV-PK, which injects prior knowledge (e.g., prohibiting certain causal directions), and TS-CAM-UV, a time-series extension that uses CAM-UV-PK as its backbone and exploits time-ordering constraints. The identifiability analysis clarifies when direct causal relations can be recovered under nonlinearity and latent confounding, and when only the presence of unobserved paths can be detected. On simulated data, prior knowledge improves precision and F-measure. TS-CAM-UV compares favorably with LPCMCI and VarLiNGAM on both simulated and real foreign-exchange time-series data, indicating that it can handle nonlinearity and latent confounding in time series. The proposed methods target causal inference in systems with unobserved variables and nonlinear interactions, with practical applicability to time-series analysis.
Abstract
This paper proposes two methods for causal additive models with unobserved variables (CAM-UV). CAM-UV assumes that the causal functions take the form of generalized additive models and that latent confounders are present. First, we propose a method that leverages prior knowledge for efficient causal discovery. Then, we propose an extension of this method for inferring causality in time series data. The original CAM-UV algorithm differs from other existing causal function models in that it does not seek the causal order between observed variables, but rather aims to identify the causes of each observed variable. The first proposed method exploits this property to incorporate prior knowledge, such as the knowledge that certain variables cannot be causes of specific others. Moreover, by incorporating the prior knowledge that causes precede their effects in time, we extend the first algorithm to the second method for causal discovery in time series data. We validate the first proposed method on simulated data, demonstrating that the accuracy of causal discovery increases as more prior knowledge is provided. Additionally, we evaluate the second proposed method by comparing it with existing time series causal discovery methods on both simulated and real-world data.
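To make the role of prior knowledge concrete, the following is a minimal sketch, not the authors' implementation, of how "variable j cannot cause variable i" constraints might be encoded, and how the rule that causes precede effects yields such constraints automatically for time-lagged data. The function names (`build_lagged_matrix`, `forbidden_from_time_order`, `candidate_causes`), the boolean-mask representation, and the choice to allow contemporaneous (same-lag) causes are all assumptions made for illustration.

```python
import numpy as np

def build_lagged_matrix(series, max_lag):
    """Stack x(t), x(t-1), ..., x(t-max_lag) side by side.

    series: array of shape (T, d).
    Returns an array of shape (T - max_lag, d * (max_lag + 1));
    column block k holds the variables at lag k.
    """
    T, _ = series.shape
    blocks = [series[max_lag - k : T - k] for k in range(max_lag + 1)]
    return np.hstack(blocks)

def forbidden_from_time_order(d, max_lag):
    """Hypothetical prior-knowledge mask derived from time order.

    forbidden[i, j] is True when column j may NOT be a cause of column i.
    A column with a smaller lag is later in time, so it cannot cause a
    column with a larger lag. Same-lag causes are allowed here (assumption).
    """
    lags = np.repeat(np.arange(max_lag + 1), d)  # lag of each column
    forbidden = lags[None, :] < lags[:, None]    # j later than i
    np.fill_diagonal(forbidden, True)            # no self-causation
    return forbidden

def candidate_causes(forbidden, i):
    """Indices of columns still allowed as causes of column i."""
    return np.flatnonzero(~forbidden[i])

# Two series, one lag: columns are [x0(t), x1(t), x0(t-1), x1(t-1)].
forbidden = forbidden_from_time_order(d=2, max_lag=1)
# Extra domain knowledge (illustrative): x1(t) cannot cause x0(t).
forbidden[0, 1] = True
print(candidate_causes(forbidden, 0))  # allowed causes of x0(t)
print(candidate_causes(forbidden, 2))  # allowed causes of x0(t-1)
```

A search that identifies the causes of each variable separately, as described for CAM-UV above, would then only need to consider the columns returned by `candidate_causes` for that variable, which is why additional prior knowledge shrinks the search space.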
