Synergizing Deconfounding and Temporal Generalization For Time-series Counterfactual Outcome Estimation
Yiling Liu, Juncheng Dong, Chen Fu, Wei Shi, Ziyang Jiang, Zhigang Hua, David Carlson
TL;DR
The paper tackles time-series counterfactual outcome estimation under time-varying confounding by introducing Sub-treatment Group Alignment (SGA) for finer-grained deconfounding and Random Temporal Masking (RTM) for better temporal generalization. The authors show that SGA optimizes a tighter bound on counterfactual risk and that RTM encourages reliance on stable historical patterns, improving long-horizon predictions. Empirical results on fully synthetic PK-PD tumor growth data and semi-synthetic MIMIC-based data show state-of-the-art performance, with ablations confirming that SGA and RTM bring complementary benefits. The framework is architecture-agnostic and can be integrated with existing backbones such as the Counterfactual Recurrent Network (CRN) or the Causal Transformer (CT), offering a practical approach to causal inference in observational time series.
Abstract
Estimating counterfactual outcomes from time-series observations is crucial for effective decision-making, e.g., deciding when to administer a life-saving treatment, yet it remains challenging because (i) the counterfactual trajectory is never observed and (ii) confounders evolve over time and distort estimation at every step. To address these challenges, we propose a novel framework that integrates two complementary approaches: Sub-treatment Group Alignment (SGA) and Random Temporal Masking (RTM). Instead of the coarse practice of aligning the marginal distributions of the treatment groups in latent space, SGA uses iterative treatment-agnostic clustering to identify fine-grained sub-treatment groups. Aligning these fine-grained groups improves distributional matching and thus leads to more effective deconfounding. We theoretically demonstrate that SGA optimizes a tighter upper bound on counterfactual risk and empirically verify its deconfounding efficacy. RTM promotes temporal generalization by randomly replacing input covariates with Gaussian noise during training. This encourages the model to rely less on potentially noisy or spuriously correlated covariates at the current step and more on stable historical patterns, thereby improving its ability to generalize across time and better preserve underlying causal relationships. Our experiments demonstrate that while applying SGA and RTM individually improves counterfactual outcome estimation, their combination consistently achieves state-of-the-art performance. This success comes from their distinct yet complementary roles: RTM enhances temporal generalization and robustness across time steps, while SGA improves deconfounding at each specific time point.
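
To make the two ideas concrete, the following is a minimal NumPy sketch, not the authors' implementation. It makes several assumptions that the abstract does not specify: RTM masks whole time steps with a fixed probability `mask_prob`; the treatment is binary; the treatment-agnostic clustering is a single run of plain k-means on pooled latent representations; and the alignment penalty is a squared distance between treated and control means within each cluster. All function and parameter names are hypothetical.

```python
import numpy as np


def random_temporal_mask(x, mask_prob, rng, noise_std=1.0):
    """Replace randomly chosen time steps of x with Gaussian noise.

    x has shape (batch, time, features). Masking entire time steps and
    using a fixed mask_prob are assumptions made for this sketch.
    """
    batch, steps, _ = x.shape
    mask = rng.random((batch, steps)) < mask_prob  # True marks a masked step
    noise = rng.normal(0.0, noise_std, size=x.shape)
    return np.where(mask[..., None], noise, x), mask


def kmeans_labels(z, k, rng, n_iter=20):
    """Cluster representations z (n, d) without looking at treatment labels."""
    centers = z[rng.choice(len(z), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # keep the old center if a cluster is empty
                centers[j] = z[labels == j].mean(axis=0)
    return labels


def subgroup_alignment_penalty(z, treatment, labels):
    """Size-weighted squared distance between treated and control means,
    computed separately inside each cluster (a stand-in for a richer
    distributional distance)."""
    penalty = 0.0
    for j in np.unique(labels):
        in_group = labels == j
        treated = z[in_group & (treatment == 1)]
        control = z[in_group & (treatment == 0)]
        if len(treated) == 0 or len(control) == 0:
            continue  # nothing to align when one arm is absent
        gap = treated.mean(axis=0) - control.mean(axis=0)
        penalty += in_group.mean() * float(np.sum(gap ** 2))
    return penalty


# Usage on toy data: mask a batch of covariate sequences, then score
# how well treated and control representations align within clusters.
demo_rng = np.random.default_rng(0)
demo_x = demo_rng.normal(size=(8, 10, 4))           # (batch, time, features)
demo_masked, demo_mask = random_temporal_mask(demo_x, 0.3, demo_rng)

demo_z = demo_rng.normal(size=(40, 2))              # latent representations
demo_t = demo_rng.integers(0, 2, size=40)           # binary treatment
demo_labels = kmeans_labels(demo_z, 3, demo_rng)
demo_penalty = subgroup_alignment_penalty(demo_z, demo_t, demo_labels)
```

In a training loop, one would presumably apply the masking only during training and add the alignment penalty to the outcome-prediction loss. Both the loss weighting and the choice of distance are left open here because the abstract does not state them.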
