Partial Causal Structure Learning for Valid Selective Conformal Inference under Interventions

Amir Asiaee, Kavey Aryan, James P. Long

TL;DR

The paper provides a contamination-robust conformal coverage theorem that quantifies how misclassifying "unaffected" calibration examples degrades coverage. The degradation is an explicit function of the contamination fraction and the calibration set size, which yields a finite-sample lower bound that holds for arbitrary contaminating distributions.

Abstract

Selective conformal prediction can yield substantially tighter uncertainty sets when we can identify calibration examples that are exchangeable with the test example. In interventional settings, such as perturbation experiments in genomics, exchangeability often holds only within subsets of interventions that leave a target variable "unaffected" (e.g., non-descendants of an intervened node in a causal graph). We study the practical regime where this invariance structure is unknown and must be learned from data. Our contributions are: (i) a contamination-robust conformal coverage theorem that quantifies how misclassification of "unaffected" calibration examples degrades coverage via an explicit function $g(\delta,n)$ of the contamination fraction and calibration set size, providing a finite-sample lower bound that holds for arbitrary contaminating distributions; (ii) a task-driven partial causal learning formulation that estimates only the binary descendant indicators $Z_{a,i}=\mathbf{1}\{i\in\mathrm{desc}(a)\}$ needed for selective calibration, rather than the full causal graph; and (iii) algorithms for descendant discovery via perturbation intersection patterns (differentially affected variable set intersections across interventions), and for approximate distance-to-intervention estimation via local invariant causal prediction. We provide recovery conditions under which contamination is controlled. Experiments on synthetic linear structural equation models (SEMs) validate the bound: under controlled contamination up to $\delta=0.30$, the corrected procedure maintains $\ge 0.95$ coverage while uncorrected selective CP degrades to $0.867$. A proof-of-concept on Replogle K562 CRISPR interference (CRISPRi) perturbation data demonstrates applicability to real genomic screens.

Paper Structure

This paper contains 48 sections, 5 theorems, 30 equations, 3 figures, 3 tables, 2 algorithms.

Key Result

Theorem 1

Fix $(i,a^\star)$ with $Z_{a^\star,i}=0$, and suppose that scores from the "good" set $\mathcal{A}^\star$ are exchangeable with the test score $R_i^{(a^\star)}$ (Assumption ass:selective_exchange). Let $\widehat{\mathcal{A}}$ be any selected calibration set of size $n=|\widehat{\mathcal{A}}|$, and l... In terms of the contamination fraction $\delta=(n-m)/n$, ... For large $n$, $g(\delta,n)\approx \delta\ldots$
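To give a feel for why contaminants erode coverage, the following is a minimal sketch of an elementary worst-case argument. It is not the paper's function $g(\delta,n)$, whose exact form is not reproduced here. The assumption is that only $m$ of the $n$ selected scores are exchangeable with the test score and that the $n-m$ contaminants may be arbitrarily small. The $k$-th smallest selected score is then still at least the $(k-(n-m))$-th smallest good score, and the $j$-th smallest of $m$ exchangeable scores covers the test score with probability at least $j/(m+1)$. The function names are hypothetical.

```python
import math


def conformal_rank(n: int, alpha: float) -> int:
    """Rank of the calibration score used as the split-conformal threshold."""
    return math.ceil((1 - alpha) * (n + 1))


def worst_case_coverage(n: int, m: int, alpha: float) -> float:
    """Elementary lower bound on coverage with m good scores among n selected.

    Assumption (not the paper's g): the n - m contaminants may all lie
    below every good score, which is the worst case for coverage.
    """
    k = conformal_rank(n, alpha)
    if k > n:
        return 1.0  # threshold is +infinity, so the set always covers
    j = k - (n - m)  # good scores guaranteed among the k smallest selected
    return max(j, 0) / (m + 1)


def corrected_rank(n: int, m: int, alpha: float):
    """Smallest rank whose worst-case bound reaches 1 - alpha, else None."""
    for k in range(1, n + 1):
        j = k - (n - m)
        if j >= 1 and j / (m + 1) >= 1 - alpha:
            return k
    return None  # no finite threshold suffices under this bound


if __name__ == "__main__":
    n, m, alpha = 100, 95, 0.1  # contamination fraction delta = 0.05
    print(conformal_rank(n, alpha))
    print(worst_case_coverage(n, m, alpha))
    print(corrected_rank(n, m, alpha))
```

With $n=100$, $m=95$ and $\alpha=0.1$, the uncorrected rank is 91 and the worst-case bound is $86/96\approx 0.896$, just under the nominal $0.9$. Raising the rank to 92 restores the guarantee under this bound, which mirrors the idea of tightening the quantile level to compensate for contamination.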

Figures (3)

  • Figure 1: Anatomy of the intervention sets for target gene $i$. (a) The calibration pool $\mathcal{A}_{\mathrm{cal}}$ (rectangle) contains all interventions except the test $a^\star$. The safe set $\mathcal{A}^\star(i)$ (green ellipse) consists of interventions that do not affect $i$; the selected set $\widehat{\mathcal{A}}$ (blue ellipse) is our classifier's estimate of $\mathcal{A}^\star(i)$. Their overlap produces three regions: good (selected and truly safe; $m$ exchangeable scores), bad (selected but actually affecting $i$; $n\!-\!m$ contaminants that degrade coverage by $g(\delta,n)$), and missed (truly safe but not selected; reduces $n$ but does not harm coverage). (b) Example DAG with five calibration interventions and target gene $i$. $\mathrm{desc}(a_1)\!=\!\{g,i\}$, $\mathrm{desc}(a_4)\!=\!\{g,h,i\}$ (both affect $i$); $\mathrm{desc}(a_2)\!=\!\{h\}$, $\mathrm{desc}(a_3)\!=\!\{h\}$, $\mathrm{desc}(a_5)\!=\!\{k\}$ (safe for $i$). The classifier selects $\widehat{\mathcal{A}}=\{a_2,a_3,a_4\}$: $a_2,a_3$ are good, $a_4$ is bad (FP, $\delta\!=\!1/3$), and $a_5$ is missed (FN).
  • Figure 2: Coverage vs. injected contamination $\delta$. Estimated (blue) degrades monotonically from $0.905$ to $0.867$; Corrected (orange) remains above $0.95$ for $\delta\ge 0.05$; Oracle (green) and Pooled (red) are unaffected. Dashed line: nominal $1-\alpha=0.9$.
  • Figure 3: Gap between empirical coverage and the theoretical lower bound from Theorem 1. All values are non-negative for the selective methods (Oracle, Estimated, Corrected), confirming the bound is valid. Pooled shows a small negative gap ($\approx -0.004$) because it uses all calibration points without selection, so the selective coverage theorem does not apply. The gap grows with $\delta$ because worst-case adversarial contamination does not occur in practice.
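The good / bad / missed partition in Figure 1(b) can be checked with plain set operations. The sketch below hard-codes the descendant sets and the selected set from the caption; the variable names are illustrative only.

```python
# Descendant sets of each calibration intervention, as listed in Figure 1(b).
desc = {
    "a1": {"g", "i"},
    "a2": {"h"},
    "a3": {"h"},
    "a4": {"g", "h", "i"},
    "a5": {"k"},
}
target = "i"

# An intervention is safe for the target if the target is not a descendant.
safe = {a for a, d in desc.items() if target not in d}

# The classifier's selected calibration set from the caption.
selected = {"a2", "a3", "a4"}

good = selected & safe    # selected and truly safe (m exchangeable scores)
bad = selected - safe     # selected but affecting the target (false positives)
missed = safe - selected  # truly safe but not selected (false negatives)

# Contamination fraction delta = (n - m) / n.
delta = len(bad) / len(selected)

print(sorted(good), sorted(bad), sorted(missed), delta)
```

Here $n=3$ and $m=2$, so $\delta=(n-m)/n=1/3$, matching the caption.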

Theorems & Definitions (15)

  • Example 1: Why selective calibration tightens intervals
  • Definition 1: Contamination fraction
  • Theorem 1: $\delta$-robust selective conformal coverage
  • Remark 1: Interpretation and how to use it
  • Remark 2: Comparison with Huber contamination bounds
  • Remark 3: Benign contamination
  • Corollary 1: Corrected selective conformal
  • Remark 4: Asymmetric error costs
  • Proposition 1: Intersection candidates are supersets
  • Proposition 2: False positive control via upstream intersections
  • ...and 5 more