Coverage-Guaranteed Prediction Sets for Out-of-Distribution Data

Xin Zou, Weiwei Liu

TL;DR

This work addresses confidence-set prediction under out-of-distribution (OOD) shifts, where standard split conformal prediction (SCP) fails due to violated exchangeability. The authors introduce an OOD-aware correction based on $f$-divergence between the target distribution and the convex hull of source domains, deriving a robust marginal-coverage guarantee and practical procedures for both population- and finite-sample settings. They provide a theoretical framework with explicit mechanisms (via $g_{f,\rho}$ and its inverse) and demonstrate the method's validity on simulated data, including multi-source scenarios. The approach offers a principled way to quantify and control uncertainty in high-stakes decisions when the test domain may differ from training domains, with concrete guidance on how to compute coverage thresholds and leverage different $f$-divergences.
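To make the role of $g_{f,\rho}$ and its inverse concrete, the following Python sketch computes both by bisection. It assumes the form commonly used in the robust conformal literature, $g_{f,\rho}(\beta) = \inf\{ z \in [0,1] : \beta f(z/\beta) + (1-\beta) f((1-z)/(1-\beta)) \le \rho \}$ with $g_{f,\rho}^{-1}(\tau) = \sup\{ \beta \in [0,1] : g_{f,\rho}(\beta) \le \tau \}$. This definition is not stated in the summary above, so the paper's exact definition should be checked before relying on it. All function names (`g`, `g_inverse`, `f_kl`, `f_tv`, `f_chi2`) are illustrative and not taken from the paper.

```python
import math


def f_kl(t):
    """f(t) = t log t, which generates the KL divergence (f(0) taken as 0)."""
    return 0.0 if t == 0 else t * math.log(t)


def f_tv(t):
    """f(t) = |t - 1| / 2, which generates the total variation distance."""
    return 0.5 * abs(t - 1.0)


def f_chi2(t):
    """f(t) = (t - 1)^2, which generates the chi-square divergence."""
    return (t - 1.0) ** 2


def bernoulli_divergence(f, z, beta):
    """f-divergence between Bernoulli(z) and Bernoulli(beta), for beta in (0, 1)."""
    return beta * f(z / beta) + (1.0 - beta) * f((1.0 - z) / (1.0 - beta))


def g(f, rho, beta, iters=60):
    """Assumed g_{f,rho}(beta): smallest z in [0, beta] whose divergence is <= rho.

    The divergence is convex in z and equals 0 at z = beta, so bisection on
    [0, beta] finds the left edge of the feasible interval.
    """
    if bernoulli_divergence(f, 0.0, beta) <= rho:
        return 0.0
    lo, hi = 0.0, beta  # infeasible at lo, feasible at hi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if bernoulli_divergence(f, mid, beta) <= rho:
            hi = mid
        else:
            lo = mid
    return hi


def g_inverse(f, rho, tau, iters=60, eps=1e-12):
    """Assumed inverse: largest beta with g(beta) <= tau, for tau in (0, 1).

    Relies on g being nondecreasing in beta and on g(beta) <= beta.
    """
    lo, hi = tau, 1.0 - eps  # g(tau) <= tau always holds
    if g(f, rho, hi) <= tau:
        return 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(f, rho, mid) <= tau:
            lo = mid
        else:
            hi = mid
    return lo


if __name__ == "__main__":
    alpha, rho = 0.1, 0.05
    for name, f in [("KL", f_kl), ("TV", f_tv), ("chi2", f_chi2)]:
        level = g_inverse(f, rho, 1.0 - alpha)
        print(f"{name}: calibrate at quantile level {level:.4f}")
```

Under these assumptions, a target coverage of $1-\alpha$ is obtained by calibrating at the inflated level $g_{f,\rho}^{-1}(1-\alpha)$ instead of $1-\alpha$. For total variation the inverse reduces to $\min(\tau + \rho, 1)$, so a target of $0.9$ with $\rho = 0.05$ corresponds to a calibration level of about $0.95$.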

Abstract

Out-of-distribution (OOD) generalization has attracted increasing research attention in recent years, due to its promising experimental results in real-world applications. In this paper, we study the confidence set prediction problem in the OOD generalization setting. Split conformal prediction (SCP) is an efficient framework for handling the confidence set prediction problem. However, the validity of SCP requires the examples to be exchangeable, which is violated in the OOD setting. Empirically, we show that trivially applying SCP results in a failure to maintain the marginal coverage when the unseen target domain is different from the source domain. To address this issue, we develop a method for forming confident prediction sets in the OOD setting and theoretically prove the validity of our method. Finally, we conduct experiments on simulated data to empirically verify the correctness of our theory and the validity of our proposed method.

Paper Structure

This paper contains 26 sections, 16 theorems, 94 equations, 5 figures.

Key Result

Lemma 2

Assume that the examples $\{ (X_i, Y_i) \}_{i=1}^{n+1}$ are exchangeable. For any nonconformity score $s(\cdot, \cdot)$ and any $\alpha \in (0,1)$, the prediction set $\widehat{C}_n$ defined in set::scp satisfies $$\mathbb{P}\left( Y_{n+1} \in \widehat{C}_n(X_{n+1}) \right) \ge 1 - \alpha,$$ where the probability is over the randomness of $\{ (X_i, Y_i) \}_{i=1}^{n+1}$.
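The marginal coverage property of SCP can be checked numerically on exchangeable data. The sketch below is a minimal illustration, not the paper's experimental setup: it assumes a 1-D linear regression problem, the absolute-residual nonconformity score $s(x, y) = |y - \hat{\mu}(x)|$, and the usual $\lceil (n+1)(1-\alpha) \rceil$-th smallest calibration score as threshold. The helper names (`sample`, `fit_line`, `conformal_quantile`, `scp_interval`) and the data-generating process are hypothetical.

```python
import math
import random


def sample(n, rng):
    """Draw n i.i.d. pairs with y = 2x + standard Gaussian noise (illustrative)."""
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [2.0 * x + rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys


def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b on 1-D inputs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def conformal_quantile(scores, alpha):
    """Return the ceil((n + 1)(1 - alpha))-th smallest score, or +inf if n is too small."""
    n = len(scores)
    k = math.ceil((n + 1) * (1.0 - alpha))
    if k > n:
        return math.inf
    return sorted(scores)[k - 1]


def scp_interval(x, model, q):
    """Prediction interval [mu(x) - q, mu(x) + q] for the absolute-residual score."""
    a, b = model
    pred = a * x + b
    return pred - q, pred + q


rng = random.Random(0)
alpha = 0.1

# Split: one part fits the model, the other calibrates the threshold.
x_tr, y_tr = sample(500, rng)
x_cal, y_cal = sample(500, rng)
x_te, y_te = sample(2000, rng)

model = fit_line(x_tr, y_tr)
scores = [abs(y - (model[0] * x + model[1])) for x, y in zip(x_cal, y_cal)]
q = conformal_quantile(scores, alpha)

# Empirical coverage on fresh data drawn from the same distribution.
covered = 0
for x, y in zip(x_te, y_te):
    lo, hi = scp_interval(x, model, q)
    covered += lo <= y <= hi
coverage = covered / len(x_te)

if __name__ == "__main__":
    print(f"threshold q = {q:.3f}, empirical coverage = {coverage:.3f}")
```

Because the test points here come from the same distribution as the calibration points, the empirical coverage should land near $1 - \alpha$. Drawing the test set from a different distribution breaks exchangeability, which is the failure mode the paper targets.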

Figures (5)

  • Figure 1: Box plots of the results over $1000$ runs. We show the results for $\alpha \in \{ 0.05, 0.1, 0.15, 0.2, 0.25, 0.3 \}$; the horizontal axis represents the value of $\alpha$. The left plot shows the coverage of the prediction sets. The red lines are the marginal coverage guarantees that we wish to achieve. The right plot shows the length of the prediction sets.
  • Figure 2: Violin plots of the coverage over $1000$ runs under the same data generation settings as in the motivating experiment section. We show results for $\alpha \in \{0.05, 0.1, 0.15, 0.2, 0.25, 0.3\}$. The red lines are the marginal coverage guarantees that we wish to achieve. The white point represents the median, while the two endpoints of the thick line are the $0.25$ quantile and the $0.75$ quantile.
  • Figure 3: Violin plots of the coverage over $1000$ runs for the multi-source OOD confidence set prediction task. We show results for $\alpha \in \{0.05, 0.1, 0.15, 0.2, 0.25, 0.3\}$. The red lines are the marginal coverage guarantees that we wish to achieve. The white point represents the median, while the two endpoints of the thick line are the $0.25$ quantile and the $0.75$ quantile.
  • Figure B.1: Violin plots of the average length over $1000$ runs under the same data generation settings as in the motivating experiment section. We show results for $\alpha \in \{0.05, 0.1, 0.15, 0.2, 0.25, 0.3\}$. The white point represents the median, while the two endpoints of the thick line are the $0.25$ quantile and the $0.75$ quantile.
  • Figure B.2: Violin plots of the average length over $1000$ runs for the multi-source OOD confidence set prediction task. We show results for $\alpha \in \{0.05, 0.1, 0.15, 0.2, 0.25, 0.3\}$. The white point represents the median, while the two endpoints of the thick line are the $0.25$ quantile and the $0.75$ quantile.

Theorems & Definitions (41)

  • Definition 1: Shafer and Vovk (2008)
  • Lemma 2: The validity of SCP
  • Lemma 3
  • Definition 4: $f$-divergence
  • Remark 1
  • Lemma 5
  • Remark 2
  • Theorem 6
  • Proposition 7
  • Theorem 8: Marginal coverage guarantee for the empirical estimations
  • ...and 31 more