Estimating counterfactual treatment outcomes over time in complex multiagent scenarios

Keisuke Fujii; Koh Takeuchi; Atsushi Kuribayashi; Naoya Takeishi; Yoshinobu Kawahara; Kazuya Takeda

TL;DR

The paper tackles the problem of estimating time-varying individual treatment effects (ITE) in complex multiagent settings with hidden confounders, proposing a Theory-based Graph Variational Counterfactual Recurrent Network (TGV-CRN). The model combines graph variational recurrent neural networks (GVRNN), which capture local inter-agent dynamics, with theory-based global computations to enable long-term, interpretable counterfactual predictions of covariates and outcomes. Key contributions include an interpretable counterfactual recurrent framework, the incorporation of domain knowledge through theory-based functions, and empirical validation on synthetic CARLA and Boid data as well as real NBA basketball data, showing lower errors in counterfactual covariate prediction and in estimating the most effective treatment timing. The approach advances multiagent causal inference by indicating when and under what circumstances interventions are effective in complex systems such as autonomous driving, biology, and sports.

Abstract

Evaluation of intervention in a multiagent system, e.g., when humans should intervene in autonomous driving systems and when a player should pass to teammates for a good shot, is challenging in various engineering and scientific fields. Estimating the individual treatment effect (ITE) using counterfactual long-term prediction is practical to evaluate such interventions. However, most of the conventional frameworks did not consider the time-varying complex structure of multiagent relationships and covariate counterfactual prediction. This may lead to erroneous assessments of ITE and difficulty in interpretation. Here we propose an interpretable, counterfactual recurrent network in multiagent systems to estimate the effect of the intervention. Our model leverages graph variational recurrent neural networks and theory-based computation with domain knowledge for the ITE estimation framework based on long-term prediction of multiagent covariates and outcomes, which can confirm the circumstances under which the intervention is effective. On simulated models of an automated vehicle and biological agents with time-varying confounders, we show that our methods achieved lower estimation errors in counterfactual covariates and the most effective treatment timing than the baselines. Furthermore, using real basketball data, our methods performed realistic counterfactual predictions and evaluated the counterfactual passes in shot scenarios.
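The abstract's two evaluation targets, the ITE from long-term counterfactual prediction and the most effective treatment timing, can be illustrated with a small Python sketch. It assumes that a trained model has already produced outcome sequences under treatment and under control; the function names, the toy numbers, and the choice of the mean ITE as the ranking criterion are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence


def ite_over_time(y_treated: Sequence[float], y_control: Sequence[float]) -> List[float]:
    """Per-step ITE: predicted outcome under treatment minus outcome under control."""
    if len(y_treated) != len(y_control):
        raise ValueError("outcome sequences must have equal length")
    return [yt - yc for yt, yc in zip(y_treated, y_control)]


def best_treatment_timing(
    rollouts: Dict[int, Sequence[float]], y_control: Sequence[float]
) -> int:
    """Return the candidate intervention time with the largest mean ITE.

    Averaging the ITE over the horizon is an assumption made for this sketch.
    """
    def mean_ite(t: int) -> float:
        ite = ite_over_time(rollouts[t], y_control)
        return sum(ite) / len(ite)

    return max(rollouts, key=mean_ite)


# Hypothetical predicted outcomes (e.g. a safe driving distance) over 4 steps.
y_control = [1.0, 1.0, 0.8, 0.5]      # no intervention
rollouts = {                          # key = candidate intervention time
    1: [1.0, 1.2, 1.3, 1.4],
    2: [1.0, 1.0, 1.1, 1.2],
    3: [1.0, 1.0, 0.8, 0.9],
}
best_t = best_treatment_timing(rollouts, y_control)
```

In this toy setting the earliest intervention yields the largest average gain, so it is selected. A different aggregation (for example, the ITE at the final step only) could be substituted without changing the structure.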

Paper Structure (24 sections, 1 theorem, 14 equations, 11 figures, 4 tables)

Introduction
Background
Preliminary
Assumptions
Variational recurrent and graph neural networks
Proposed Method
Representation learning of confounders
Prediction with learned representation
Loss function
Related work
Experiments
Synthetic datasets
Real-world basketball dataset
Conclusions
A proof of Theorem 1
...and 9 more sections

Key Result

Theorem 1

If we recover $p(z_t^{(i)}|x_t^{(i)},\mathcal{H}_t^{(i)})$ and $p(y_{t+1}^{(i)}|z_t^{(i)},a_t^{(i)})$, then the proposed methods can recover the ITE under the causal graph in Fig. 2.
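The following LaTeX sketch shows one standard way the two recovered distributions could combine into the ITE. It is an illustrative reading, not the paper's proof (which is given in the appendix and may proceed differently). It assumes a binary treatment $a \in \{0, 1\}$, consistency, and that, given $z_t^{(i)}$, the potential outcome $y_{t+1}^{(i)}(a)$ is independent of $x_t^{(i)}$, $\mathcal{H}_t^{(i)}$, and $a_t^{(i)}$; the potential-outcome notation $y_{t+1}^{(i)}(a)$ is introduced here for illustration.

```latex
\begin{align*}
\mathbb{E}\!\left[ y_{t+1}^{(i)}(a) \mid x_t^{(i)}, \mathcal{H}_t^{(i)} \right]
 &= \int \mathbb{E}\!\left[ y_{t+1}^{(i)}(a) \mid z_t^{(i)}, x_t^{(i)}, \mathcal{H}_t^{(i)} \right]
    p\!\left( z_t^{(i)} \mid x_t^{(i)}, \mathcal{H}_t^{(i)} \right) dz_t^{(i)} \\
 &= \int \mathbb{E}\!\left[ y_{t+1}^{(i)} \mid z_t^{(i)}, a_t^{(i)} = a \right]
    p\!\left( z_t^{(i)} \mid x_t^{(i)}, \mathcal{H}_t^{(i)} \right) dz_t^{(i)}, \\
\mathrm{ITE}_t^{(i)}
 &= \mathbb{E}\!\left[ y_{t+1}^{(i)}(1) \mid x_t^{(i)}, \mathcal{H}_t^{(i)} \right]
  - \mathbb{E}\!\left[ y_{t+1}^{(i)}(0) \mid x_t^{(i)}, \mathcal{H}_t^{(i)} \right].
\end{align*}
```

The first line marginalizes over the hidden confounder representation; the second replaces the potential outcome with the observed-outcome conditional, which is where the independence and consistency assumptions are used. Both factors on the right-hand side are exactly the quantities the theorem requires to be recovered.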

Figures (11)

Figure 1: Illustrations of our problems. Interventions in (A) an autonomous vehicle simulation, (B) a biological agent simulation, and (C) a real basketball game are shown. In (A) and (C), a single agent (A: an ego-vehicle and C: a ball player) is intervened on, whereas multiple agents (all boids) are intervened on in (B). The motivations and variable definitions are described in the Introduction and Background sections. In short, we aim to perform long-term counterfactual prediction of outcomes and covariates from the past covariates and intervention (or treatment assignment). In (A), our approach can test autonomous driving software, including human interventions, without recreating the same real-world situations for controlled trials. The outcome is a safe driving distance. In (B), our approach has the potential to estimate the effect of an experimenter’s interventions on multi-animal behaviors, which improves the efficiency of experimental procedures for observing desired movements. The outcome is the angular velocity of multiple agents. In (C), our approach can estimate the effect of pass selection in basketball shot scenarios, so that decision-making skills in this situation (e.g., during a game) can be evaluated. The outcome is the effectiveness of an attack.
Figure 2: The illustration of causal graphs for our problem. We denote $X_t, Z_t, A_t, Y_{t+1}$ as the dynamic covariates, representations of hidden confounders, treatment assignment, and outcomes, respectively. The black lines indicate the causal relations. The hidden confounders $Z_{t+1}$ usually affect the treatment assignment $A_{t+1}$, the outcome $Y_{t+2}$, and the covariate $X_{t}$. To infer $Z_{t+1}$, we can leverage the observational data $X_{t+1}$ and previous hidden confounders $Z_{t}$.
Figure 3: The illustration of TGV-CRN. (A) TGV-CRN aims to estimate ITE based on long-term prediction of multiagent covariates and outcomes while visualizing the long-term future covariate prediction. TGV-CRN leverages GVRNN to represent local agent interactions and theory-based functions for covariate and outcome prediction, which can confirm under what circumstances the intervention is effective. Specifically, (B) the training and inference processes of the GNN encoder, prior, and decoder are illustrated. At each time stamp, the model takes the current covariates and treatment assignments as input to learn representations of the hidden confounders via GRNNs and GNN encoders. Then, via theory-based computations, the GNN decoders, and multi-layer perceptrons (MLPs), the model predicts time-varying covariates, a potential outcome, and a treatment. We also use the gradient reversal layer before the treatment classifier to ensure that the confounder representation distributions of the treated and the controlled are similar at the group level.
Figure 4: Illustration of the gradient reversal layer (GRL). The diagram is a part of Fig. 3, ignoring covariate prediction and including backward passes. $\theta_g$, $\theta_a$, and $\theta_y$ are the neural network parameters of the GNN encoder, the MLP for treatment prediction, and the outcome predictor (if $f^y_{theory}$ is a neural network), respectively. The GRL multiplies the gradient by a negative constant during backpropagation-based training.
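The behaviour of a gradient reversal layer can be sketched in a few lines of dependency-free Python: it is the identity in the forward pass and multiplies incoming gradients by a negative constant in the backward pass. The class name, the scale `lam`, and the toy logistic treatment head below are hypothetical stand-ins for illustration; the paper's model uses a GNN encoder and an MLP treatment classifier, which are not reproduced here.

```python
import math
from typing import List


class GradientReversal:
    """Identity in the forward pass; scales gradients by -lam in the backward pass."""

    def __init__(self, lam: float = 1.0) -> None:
        if lam < 0:
            raise ValueError("lam must be non-negative")
        self.lam = lam

    def forward(self, z: List[float]) -> List[float]:
        return list(z)  # representation passes through unchanged

    def backward(self, grad_out: List[float]) -> List[float]:
        return [-self.lam * g for g in grad_out]  # reversed and scaled gradient


def treatment_loss_grad(h: List[float], w: List[float], a: float) -> List[float]:
    """Gradient w.r.t. h of the binary cross-entropy of sigmoid(w . h) against label a."""
    logit = sum(wi * hi for wi, hi in zip(w, h))
    p = 1.0 / (1.0 + math.exp(-logit))
    return [(p - a) * wi for wi in w]


grl = GradientReversal(lam=0.5)      # hypothetical reversal strength
z = [0.2, -1.0]                      # toy confounder representation from the encoder
w = [1.0, 2.0]                       # toy treatment-head weights
h = grl.forward(z)                   # what the treatment head sees
grad_h = treatment_loss_grad(h, w, a=1.0)
grad_z = grl.backward(grad_h)        # what flows back into the encoder
```

Because the encoder receives the negated gradient, a descent step on the encoder parameters increases the treatment-classification loss, which pushes the representation toward being uninformative about the treatment while the treatment head itself is still trained normally.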
Figure 5: Example CARLA results using our method. (Top) Visualization of covariates and (middle row and bottom) outcome time series in (left) ground truth without intervention, (middle column) counterfactual intervention using our model, and (right) the baseline. The middle-row subfigures are enlarged views of the bottom ones from 20 s. An ego car (red square) and obstacles (black) are shown in the upper plots (see also Fig. 1A) at the intervention time, which is the solid line in the lower plots. The unfilled circle is the start of the long-term prediction (dashed line in the lower plots). The ego car moves from right to left and stops because of the obstacles. The videos are available on the GitHub page referenced above.
...and 6 more figures

Theorems & Definitions (3)

Definition 1: Sequential Strong Ignorability
Theorem 1: Identification of ITE
proof
