Counterfactual Forecasting For Panel Data

Navonil Deb, Raaz Dwivedi, Sumanta Basu

TL;DR

FOCUS tackles counterfactual forecasting in panel data with missing entries by embedding stochastic dynamics into a low-rank factor model and forecasting through a VAR(1) on the latent factors. It combines PCA-based factor estimation with time-series forecasting to produce out-of-sample counterfactual means, providing nonasymptotic error bounds and asymptotic normality, along with valid confidence intervals. Empirical results on simulations and the HeartSteps mobile-health study show that leveraging autoregressive latent dynamics yields more accurate counterfactual forecasts than benchmark methods, enabling prospective decision-making under interventions. The approach offers a principled, scalable framework for forecasting counterfactuals in settings with missing data and temporally dependent latent structure, with potential extensions to nonstationary dynamics and doubly robust estimators.

Abstract

We address the challenge of forecasting counterfactual outcomes in panel data with missing entries and temporally dependent latent factors -- a common scenario in causal inference, where estimating unobserved potential outcomes ahead of time is essential. We propose Forecasting Counterfactuals under Stochastic Dynamics (FOCUS), a method that extends traditional matrix completion methods by leveraging the time series dynamics of the factors, thereby improving the prediction accuracy of future counterfactuals. Building upon a PCA estimator, our method accommodates both stochastic and deterministic components within the factors and provides a flexible framework for various applications. In the case of stationary autoregressive factors and under standard conditions, we derive error bounds and establish asymptotic normality of our estimator. Empirical evaluations demonstrate that our method outperforms existing benchmarks when the latent factors have an autoregressive component. We apply FOCUS to HeartSteps, a mobile health study, illustrating its effectiveness in forecasting step counts for users receiving activity prompts by leveraging temporal patterns in user behavior.
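
The two-stage idea described above (PCA factor estimation followed by a VAR(1) forecast of the latent factors) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a fully observed $N \times T$ panel (the missing-data handling that FOCUS provides is omitted), a known number of factors `r`, and a factor model of the form $Y = \Lambda F^\top + \text{noise}$ with $F_{t+1} = A F_t + \text{noise}$. The function names `pca_factors`, `fit_var1`, and `forecast_mean` are hypothetical.

```python
import numpy as np

def pca_factors(Y, r):
    """PCA estimate of loadings (N x r) and factors (T x r) from an N x T panel.

    Assumption: Y is fully observed; missing entries are NOT handled here.
    """
    T = Y.shape[1]
    _, _, Vt = np.linalg.svd(Y, full_matrices=False)
    F = np.sqrt(T) * Vt[:r].T      # factors, normalized so that F.T @ F / T = I
    Lam = Y @ F / T                # loadings obtained by projecting Y on F
    return Lam, F

def fit_var1(F):
    """Least-squares fit of F[t+1] = A @ F[t] + noise; returns the r x r matrix A."""
    X, Z = F[:-1], F[1:]
    B, *_ = np.linalg.lstsq(X, Z, rcond=None)   # Z is approximately X @ B
    return B.T                                   # so A = B.T

def forecast_mean(Y, r, h):
    """h-step-ahead forecast of the mean outcome for every unit (length-N vector)."""
    Lam, F = pca_factors(Y, r)
    A = fit_var1(F)
    F_future = np.linalg.matrix_power(A, h) @ F[-1]   # propagate the last factor
    return Lam @ F_future

if __name__ == "__main__":
    # Simulated panel with stationary VAR(1) factors (illustrative values only).
    rng = np.random.default_rng(0)
    N, T, r, h = 64, 128, 2, 1
    A_true = np.array([[0.6, 0.1], [0.0, 0.5]])
    F_true = np.zeros((T, r))
    for t in range(1, T):
        F_true[t] = A_true @ F_true[t - 1] + rng.normal(size=r)
    Lam_true = rng.normal(size=(N, r))
    Y = Lam_true @ F_true.T + 0.1 * rng.normal(size=(N, T))
    print(forecast_mean(Y, r, h).shape)
```

Because the product of estimated loadings, the powered transition matrix, and the last estimated factor is unchanged by an invertible rotation of the factors, the forecast does not depend on the particular normalization chosen in `pca_factors`.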

Paper Structure

This paper contains 42 sections, 16 theorems, 109 equations, 6 figures.

Key Result

Theorem 4.1

Consider a factor model eq:factor_model_1 with $N$ units and $T$ time points satisfying Assum. asn:var1_factors to asn:observation. Then for the Focus estimator $\hat{\theta}_{i,T : T + h}$ in eq:forecast_target_est, any fixed unit $i \in [N]$ and forecast horizon $h\ge 1$, the absolute error associ

Figures (6)

  • Figure 1: Mean squared forecast error (MSFE, averaged over 30 trials) across the benchmarks for $N = 64$ and three generative models. Panels (a)-(d) present the average MSFE of Focus (blue triangle), mSSA (orange circle) and SyNBEATS (green diamond) across $T \in \{2^5, 2^6, 2^7, 2^8\}$; the vertical lines mark one-standard-deviation error bars. Compared to SyNBEATS and mSSA, Focus has a lower average MSFE that decreases faster with $T$ (empirical rates in the legends). Panels (e)-(h) present scatter plots of the difference in MSFE (Focus - benchmark method) for $T = 128$. The errors of Focus are significantly lower (p-values of Wilcoxon's one-sided pairwise test in the legends are below 0.01), so the scatter plots concentrate in the $y > x$ region.
  • Figure 2: Results of Focus and mSSA for HeartSteps data. Panel (a) presents a scatter plot of estimated factors for slot pair (4, 5) that shows a strong temporal correlation of 0.62 for $T = 170$, highlighting the predictive power leveraged by Focus. Panels (b) and (c) present the point-wise prediction error for mSSA and Focus at $T = 170, 200$ for users with positive steps at $T+h$. For most users, Focus outputs (blue triangle) are closer to the horizontal line at 0 than mSSA outputs (orange circle), indicating more accurate prediction. We note that for both methods, most users have negative prediction errors. Panel (c) presents the difference in MSRPE (Focus - mSSA) for users with positive steps at $T+h$ across different $T$. As $T$ grows, the difference stays below the horizontal line at 0 and decreases with $T$ -- indicating empirically better performance of Focus with increasing $T$.
  • Figure 3: Additional figures for MSFE (averaged over 30 trials) for $h = 2$ across the benchmarks for $N = 64$ and three generative models. Panels (a) and (b) present the average MSFE of Focus (blue triangle), mSSA (orange circle) and SyNBEATS (green diamond) across $T \in \{2^5, 2^6, 2^7, 2^8\}$; the vertical lines mark one-standard-deviation error bars. Compared to SyNBEATS and mSSA, Focus has a lower average MSFE that decreases faster with $T$ (empirical rates in the legends). Panels (c) and (d) present scatter plots of the difference in MSFE (Focus - benchmark method) for $T = 128$ and $h = 2$. The errors of Focus are significantly lower (p-values of Wilcoxon's one-sided pairwise test in the legends are below 0.01), so the scatter plots concentrate in the $y > x$ region.
  • Figure 4: Additional figures for MSFE (averaged over 30 trials) for $h = 3$ across the benchmarks for $N = 64$ and three generative models. Panels (a) and (b) present the average MSFE of Focus (blue triangle), mSSA (orange circle) and SyNBEATS (green diamond) across $T \in \{2^5, 2^6, 2^7, 2^8\}$; the vertical lines mark one-standard-deviation error bars. Compared to SyNBEATS and mSSA, Focus has a lower average MSFE that decreases faster with $T$ (empirical rates in the legends). Panels (c) and (d) present scatter plots of the difference in MSFE (Focus - benchmark method) for $T = 128$ and $h = 3$. The errors of Focus are significantly lower (p-values of Wilcoxon's one-sided pairwise test in the legends are below 0.01), so the scatter plots concentrate in the $y > x$ region.
  • Figure 5: $\log(1 + \texttt{jbsteps30})$ vs. time for three users in HeartSteps data. The 5 decision slots each day are marked by dashed blue vertical lines. Green dots indicate that the user was nudged; red dots indicate that the user was not nudged. Between consecutive slots, the steps exhibit a negative correlation shared across users.
  • ...and 1 more figure

Theorems & Definitions (20)

  • Remark 4.1: Validity of the assumption
  • Theorem 4.1: Error bound for $\hat{\theta}_{i,T : T + h}$
  • Remark 4.2: Role of $h$
  • Theorem 4.2: Asymptotic normality of $\hat{\theta}_{i,T : T + h}$
  • Corollary 4.1: Asymptotic C.I. for $\hat{\theta}_{i,T : T + h}$
  • Corollary 4.2: Focus under MCAR and staggered adoption
  • Lemma A.1: Error bound on $\hat{A}$
  • Remark A.1
  • Lemma B.1: Asymptotic normality of $\hat{A}$
  • Lemma B.2: Asymptotic independence
  • ...and 10 more