Robust Conformal Prediction under Distribution Shift via Physics-Informed Structural Causal Model

Rui Xu, Yue Sun, Chao Chen, Parv Venkitasubramaniam, Sihong Xie

TL;DR

Inspired by the invariance of physics across data distributions, the paper proposes a physics-informed structural causal model (PI-SCM) and validates that it improves coverage robustness across confidence levels and test domains on a traffic speed prediction task and an epidemic spread task with multiple real-world datasets.

Abstract

Uncertainty is critical to reliable decision-making with machine learning. Conformal prediction (CP) handles uncertainty by predicting a set for a test input, with the aim that the set covers the true label with probability at least $(1-\alpha)$. This coverage can be guaranteed on test data even if the marginal distributions $P_X$ differ between the calibration and test datasets. However, as is common in practice, when the conditional distribution $P_{Y|X}$ differs between calibration and test data, the coverage is not guaranteed, and it is essential to measure and minimize the coverage loss under distribution shift at all possible confidence levels. To address these issues, we upper bound the coverage difference at all levels using the cumulative distribution functions of the calibration and test conformal scores and the Wasserstein distance. Inspired by the invariance of physics across data distributions, we propose a physics-informed structural causal model (PI-SCM) to reduce the upper bound. We validate that PI-SCM improves coverage robustness across confidence levels and test domains on a traffic speed prediction task and an epidemic spread task with multiple real-world datasets.
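
As a rough illustration of the coverage notion in the abstract, the following Python sketch calibrates a split conformal threshold and measures empirical coverage with and without a shift in the conditional distribution. It is not the paper's method: the absolute-residual conformal score, the synthetic Gaussian noise, and the helper names `conformal_quantile` and `empirical_coverage` are all assumptions made for this example.

```python
import numpy as np

def conformal_quantile(scores, alpha):
    """(1 - alpha) quantile of calibration scores with the usual (n + 1) correction."""
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))  # rank of the calibrated threshold
    if k > n:
        return np.inf  # too few calibration points for this level
    return np.sort(scores)[k - 1]

def empirical_coverage(cal_scores, test_scores, alpha):
    """Fraction of test scores that fall at or below the calibrated threshold."""
    q = conformal_quantile(cal_scores, alpha)
    return float(np.mean(test_scores <= q))

# Synthetic scores |y - f(x)| (assumption: residual noise is Gaussian).
rng = np.random.default_rng(0)
cal = np.abs(rng.normal(0.0, 1.0, 2000))
test_same = np.abs(rng.normal(0.0, 1.0, 2000))   # same P_{Y|X} as calibration
test_shift = np.abs(rng.normal(0.0, 2.0, 2000))  # noise scale doubles: shift in P_{Y|X}

alpha = 0.1
cov_same = empirical_coverage(cal, test_same, alpha)
cov_shift = empirical_coverage(cal, test_shift, alpha)
print(f"target {1 - alpha:.2f}, no shift {cov_same:.3f}, shifted {cov_shift:.3f}")
```

Under these assumptions, coverage on the unshifted test scores stays near the target level, while coverage on the shifted scores falls well below it, which is the coverage loss the abstract refers to.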

Paper Structure (25 sections, 37 equations, 8 figures, 2 tables)

Figures (8)

  • Figure 1: Reduction of Wasserstein distance by the Physics-Informed Structural Causal Model (PI-SCM). After reducing the influence of different marginal distributions by importance weighting, the $(1-\alpha)$ quantile of the weighted calibration conformal scores is calculated as $V_q$. The difference, $|D|$, between the coverage on the weighted calibration conformal scores, $\hat{\mathbb{P}}$, and the coverage on the test conformal scores, $\mathbb{P}$, is calculated from their corresponding cumulative distribution functions (CDFs) at $V_q$. To evaluate the closeness of the CDFs across confidence levels, the Wasserstein distance scans $|D|$ along the quantile axis, showing the domain adaptation ability of a model. PI-SCM can capture more physical causality than data-driven models, thus leading to a lower Wasserstein distance.
  • Figure 2: Absolute coverage divergences $|D|_{\alpha}$ along the $(1-\alpha)$ confidence level for traffic speed prediction (top) and epidemic spread prediction (bottom). Models that better fit the PI-SCM incorporate more physical causality and diverge less from the expected coverage, thus showing better coverage robustness. The results are averaged over 10 runs.
  • Figure 3: Absolute coverage divergences $|D|_{t}$ along single-hour test sets of the traffic speed prediction task. The RD-UQ model, guided by PI-SCM, reduces the high $|D|_{t}$ of the RD-U model (for example, from 1:00 AM to 5:00 AM) to the low level seen at other hours. The results are averaged over 10 runs.
  • Figure 4: Absolute coverage divergences $|D|_{t}$ along pandemic intervals of the epidemic spread prediction task. The SIR model reduces $|D|_{t}$ in the Initiation, Acceleration, and Subsidence intervals, whereas the improvement in the Deceleration interval is not obvious. The results are averaged over 10 runs.
  • Figure 5: Reaction effect data from 7:00 AM to 8:00 AM for sensor ID D005ES17288 in the Seattle-loop dataset; its $N^r$ contains only 1 sensor, which simplifies the presentation. (a) All $\triangle u_{(i,j)}$, $\triangle q_{(i,j)}$, and $\triangle u_i$ data of the sensor. (b) High-density traffic data, $k_i\in (k_2,\infty)$. (c) Medium-density traffic data, $k_i\in [k_1,k_2]$. (d) Low-density traffic data, $k_i\in [0,k_1)$. Panel (a) shows that $\triangle q_{(i,j)}$ is a strong indicator of $\triangle u_i$, whereas the dependency of $\triangle u_i$ on $\triangle u_{(i,j)}$ is unclear without the aid of $\triangle q_{(i,j)}$. Comparing (b), (c), and (d) shows that the correlation between $\triangle u_{(i,j)}$, $\triangle q_{(i,j)}$, and $\triangle u_i$ grows stronger as the traffic density, $k_i$, increases.
  • ...and 3 more figures
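
The quantities named in the Figure 1 caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names (`weighted_quantile`, `coverage_gap`, `wasserstein_1`) are hypothetical, the importance weights are taken as given (uniform here as a placeholder), and the Wasserstein-1 distance is computed as the area between the weighted calibration CDF and the test CDF, which is the standard one-dimensional form.

```python
import numpy as np

def weighted_quantile(scores, weights, level):
    """Smallest score whose weighted empirical CDF reaches `level` (plays the role of V_q)."""
    order = np.argsort(scores)
    s, w = scores[order], weights[order]
    cdf = np.cumsum(w) / np.sum(w)
    idx = np.searchsorted(cdf, level, side="left")
    return s[min(idx, len(s) - 1)]

def coverage_gap(cal_scores, cal_weights, test_scores, alpha):
    """|D|: gap between weighted calibration coverage and test coverage at V_q."""
    v_q = weighted_quantile(cal_scores, cal_weights, 1 - alpha)
    cal_cov = np.sum(cal_weights[cal_scores <= v_q]) / np.sum(cal_weights)
    test_cov = np.mean(test_scores <= v_q)
    return float(abs(cal_cov - test_cov))

def wasserstein_1(cal_scores, cal_weights, test_scores):
    """Area between the weighted calibration CDF and the test CDF."""
    grid = np.sort(np.concatenate([cal_scores, test_scores]))
    order = np.argsort(cal_scores)
    s, w = cal_scores[order], cal_weights[order]
    cum = np.concatenate([[0.0], np.cumsum(w) / np.sum(w)])
    # Both CDFs are step functions, constant between consecutive grid points.
    f_cal = cum[np.searchsorted(s, grid[:-1], side="right")]
    f_test = np.searchsorted(np.sort(test_scores), grid[:-1], side="right") / len(test_scores)
    return float(np.sum(np.abs(f_cal - f_test) * np.diff(grid)))

# Toy usage with uniform weights (assumption: no marginal shift to correct).
cal = np.array([0.0, 1.0, 2.0])
test = np.array([1.0, 2.0, 3.0])
w = np.ones_like(cal)
print(coverage_gap(cal, w, test, alpha=0.1), wasserstein_1(cal, w, test))
```

In practice the weights would come from an estimated likelihood ratio between test and calibration inputs. A model whose conformal scores transfer better across domains would show a smaller gap at each level and a smaller distance overall.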

Theorems & Definitions (1)

  • Definition 1: Quantile