TraCE: Trajectory Counterfactual Explanation Scores

Jeffrey N. Clark, Edward A. Small, Nawid Keshtmand, Michelle W. L. Wan, Elena Fillola Mayoral, Enrico Werner, Christopher P. Bourdeaux, Raul Santos-Rodriguez

TL;DR

TraCE introduces a model-agnostic framework that condenses progress in sequential decisions into a single interpretable score by leveraging counterfactual trajectories. The score $S\in[-1,1]$ combines angle and distance alignments via $S(x_t,x'_t)=\lambda R_1(x_t,x'_t)+(1-\lambda)R_2(x_t,x'_t)$, enabling benchmarking across time and multiple counterfactual targets with a simple interpretation. The authors validate TraCE in two domains: ICU patient trajectories using MIMIC data and monitoring of global development against SSP projections, showing discriminative power between outcomes and useful insights for real-time decision support. This work suggests a path toward standardized, explainable progress metrics for complex sequential tasks, with future work on feature weighting, higher-dimensional data, and deployment in clinical and policy contexts.
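The convex combination in the score can be sketched as follows, assuming both component scores lie in $[-1,1]$ so that $S$ does too. The names `angle_alignment` and `trace_score` are illustrative and not taken from the paper. Using cosine similarity for $R_1$ is an assumption based on its description as an angle alignment, and $R_2$ is passed in precomputed because its definition is not given here.

```python
import numpy as np


def angle_alignment(x_prev, x_curr, target):
    """Cosine of the angle between the step taken and the direction to the target.

    Assumption: a stand-in for R_1, which is described only as an angle alignment.
    """
    step = np.asarray(x_curr, dtype=float) - np.asarray(x_prev, dtype=float)
    to_target = np.asarray(target, dtype=float) - np.asarray(x_prev, dtype=float)
    denom = np.linalg.norm(step) * np.linalg.norm(to_target)
    if denom == 0.0:
        # No movement, or already at the target: treat alignment as neutral.
        return 0.0
    return float(np.dot(step, to_target) / denom)


def trace_score(r1, r2, lam=0.5):
    """Combine the two components as S = lam * R_1 + (1 - lam) * R_2."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * r1 + (1.0 - lam) * r2


# A step that points straight at the target, with a hypothetical neutral R_2.
r1 = angle_alignment([0.0, 0.0], [1.0, 0.0], [2.0, 0.0])
score = trace_score(r1, r2=0.0, lam=0.5)
```

With $\lambda=1$ the score reduces to the angle term alone, and with $\lambda=0$ to the distance term alone, so $\lambda$ sets how much weight direction gets relative to step size.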

Abstract

Counterfactual explanations, and their associated algorithmic recourse, are typically leveraged to understand, explain, and potentially alter a prediction coming from a black-box classifier. In this paper, we propose to extend the use of counterfactuals to evaluate progress in sequential decision making tasks. To this end, we introduce a model-agnostic modular framework, TraCE (Trajectory Counterfactual Explanation) scores, which is able to distill and condense progress in highly complex scenarios into a single value. We demonstrate TraCE's utility across domains by showcasing its main properties in two case studies spanning healthcare and climate change.

Paper Structure (19 sections, 1 theorem, 17 equations, 8 figures)

Key Result

Theorem 1

Given $a,b,c\in\mathbb{R}^n$, the closest point $d$ to $a$ in the vector direction $c-b$ is: $d = b + \theta\,\frac{\lVert g \rVert}{\lVert h \rVert}\,h$, where $h=c-b$, $g=a-b$ and $\theta = \frac{\langle h \; , \; g \rangle}{\lVert h \rVert \lVert g \rVert}$.
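As a numerical illustration, the closest point can be computed as the orthogonal projection of $a$ onto the line through $b$ with direction $h=c-b$. This is a minimal sketch that assumes the line is unbounded (not clipped to the segment from $b$ to $c$); the function name `closest_point_on_line` is hypothetical.

```python
import numpy as np


def closest_point_on_line(a, b, c):
    """Closest point to `a` on the line through `b` with direction c - b."""
    a, b, c = (np.asarray(v, dtype=float) for v in (a, b, c))
    h = c - b
    g = a - b
    h_norm = np.linalg.norm(h)
    if h_norm == 0.0:
        raise ValueError("b and c must differ to define a direction")
    # <h, g> / ||h||^2 equals theta * ||g|| / ||h||, with theta the cosine
    # of the angle between h and g.
    t = np.dot(h, g) / h_norm**2
    return b + t * h


a, b, c = [1.0, 1.0], [0.0, 0.0], [2.0, 0.0]
d = closest_point_on_line(a, b, c)
# The residual a - d should be orthogonal to the direction c - b.
residual_dot = float(np.dot(np.asarray(a) - d, np.asarray(c) - np.asarray(b)))
```

The orthogonality of the residual $a-d$ to $h$ is what makes $d$ the minimiser of the distance to $a$ along that direction.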

Figures (8)

  • Figure 1: TraCE for 2-D toy data set classification with three classes: light orange (current class), blue (desired class), and red (undesired class). The factual, $x$, moves over the sequence, as do the respective target counterfactual points (stars). Between segments of the true trajectory (e.g. $x_1$, $x_2$) TraCE measures alignment in angle, $R_1$, and the "best move" given the angle, $R_2$, with respect to counterfactual target points (stars in the left panel). In this example the TraCE score for moving from $x_0$ to $x_1$ is negative (-0.1855) because it aligns more with the negative counterfactual (red class), whereas the trajectory from $x_1$ to $x_2$ is away from the negative counterfactual and towards the positive counterfactual (blue class) hence the positive score (0.4056).
  • Figure 2: Contrasting example patient journeys. For each, top: instantaneous TraCE scores, higher indicates more alignment with the specified counterfactuals. 'Desirable' refers to alignment with successful discharge counterfactuals, 'Undesirable' refers to mortality counterfactuals. TraCE is computed on the current and preceding time point, hence time point 0 is not presented. Bottom: Classifier probabilities via the prediction model. Values in the legends are averages across the whole trajectory. NRFD = Not ready for discharge, RFD = ready for discharge (desirable outcome), mortality (undesirable outcome).
  • Figure 3: Average TraCE scores for each SSP for 15 countries, across the period 2015-2022. Higher TraCE scores indicate closer alignment with a Shared Socioeconomic Pathway (SSP).
  • Figure 4: Cumulative monthly SSP TraCE scores for Norway, 2015-2022.
  • Figure A.1: Geometric image of the proof for Theorem 1.
  • ...and 3 more figures

Theorems & Definitions (2)

  • Theorem 1
  • proof