Optimized Conformal Selection: Powerful Selective Inference After Conformity Score Optimization

Tian Bai, Ying Jin

TL;DR

This paper presents OptCS, a general framework that allows valid statistical testing after flexible data-driven model optimization, and introduces general conditions under which OptCS constructs valid conformal p-values despite substantial data reuse and handles complex p-value dependencies to maintain finite-sample FDR control via a novel multiple testing procedure.

Abstract

Model selection/optimization in conformal inference is challenging, since it may break the exchangeability between labeled and unlabeled data. We study this problem in the context of conformal selection, which uses conformal p-values to select "interesting" instances with large unobserved labels from a pool of unlabeled data, while controlling the FDR in finite samples. For validity, existing solutions require the model choice to be independent of the data used to construct the p-values and calibrate the selection set. However, when presented with many model choices and limited labeled data, it is desirable to (i) select the best model in a data-driven manner, and (ii) mitigate power loss due to sample splitting. This paper presents OptCS, a general framework that allows valid statistical testing (selection) after flexible data-driven model optimization. We introduce general conditions under which OptCS constructs valid conformal p-values despite substantial data reuse and handles complex p-value dependencies to maintain finite-sample FDR control via a novel multiple testing procedure. We instantiate this general recipe to propose three FDR-controlling procedures, each optimizing the models differently: (i) selecting the most powerful one among multiple pre-trained candidate models, (ii) using all data for model fitting without sample splitting, and (iii) combining full-sample model fitting and selection. We demonstrate the efficacy of our methods via simulation studies and real applications in drug discovery and alignment of large language models in radiology report generation.
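To make the starting point concrete, the following is a minimal sketch of the baseline that the abstract contrasts OptCS with: split conformal selection with a single model fixed independently of the calibration data. It is not the OptCS procedure itself. The score $V(x, y) = y - \mu(x)$, the thresholds `c`, and all function names are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def conformal_pvalues(calib_scores, test_scores):
    """Conformal p-values: (1 + #{calibration scores < test score}) / (n + 1)."""
    calib = np.sort(np.asarray(calib_scores, dtype=float))
    test = np.asarray(test_scores, dtype=float)
    # side="left" counts calibration scores strictly below each test score
    n_below = np.searchsorted(calib, test, side="left")
    return (1.0 + n_below) / (len(calib) + 1.0)

def benjamini_hochberg(pvals, q):
    """Indices rejected by the Benjamini-Hochberg procedure at level q."""
    pvals = np.asarray(pvals, dtype=float)
    m = len(pvals)
    order = np.argsort(pvals)
    passed = pvals[order] <= q * np.arange(1, m + 1) / m
    if not passed.any():
        return np.array([], dtype=int)
    k = np.max(np.nonzero(passed)[0]) + 1  # largest rank meeting its threshold
    return np.sort(order[:k])

def conformal_select(mu_calib, y_calib, mu_test, c, q):
    """Select test points whose unobserved label plausibly exceeds c.

    Assumed score: V(x, y) = y - mu(x), which is monotone in y.
    Calibration uses the true labels; test points plug in the threshold c.
    """
    calib_scores = np.asarray(y_calib, dtype=float) - np.asarray(mu_calib, dtype=float)
    test_scores = np.asarray(c, dtype=float) - np.asarray(mu_test, dtype=float)
    return benjamini_hochberg(conformal_pvalues(calib_scores, test_scores), q)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical data: predictions mu(x) plus noise give the labels
    mu_calib = rng.normal(size=500)
    y_calib = mu_calib + 0.5 * rng.normal(size=500)
    mu_test = rng.normal(size=100)
    selected = conformal_select(mu_calib, y_calib, mu_test, c=np.zeros(100), q=0.1)
    print(len(selected), "test points selected")
```

The validity of this baseline rests on the predictor being chosen without looking at the calibration or test data. Picking the predictor by comparing such selection sets across candidates on the same data is the kind of data reuse that the abstract says can break exchangeability.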

Paper Structure

This paper contains 48 sections, 7 theorems, 94 equations, 16 figures, 3 tables, 2 algorithms.

Key Result

Theorem 3.2

Suppose for all $j\in [m]$, the following conditions hold: Then, for any nominal FDR level $q \in (0,1)$ and any pruning method, the output ${\mathcal{S}}$ of OptCS obeys the FDR control guarantee \eqref{eq:fdr} in finite samples, where the expectation is taken over both the calibration and test data.
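The equation referenced by the theorem is not reproduced in this summary. As a reading aid, the FDR guarantee in conformal selection is conventionally written as below; the null set $\mathcal{H}_0$ and the thresholds $c_j$ are notational assumptions and may differ from the paper's own display.

```latex
\mathrm{FDR}
  \;=\; \mathbb{E}\!\left[
      \frac{|\mathcal{S} \cap \mathcal{H}_0|}{\max\{1,\, |\mathcal{S}|\}}
    \right]
  \;\le\; q,
\qquad
\mathcal{H}_0 \;=\; \{\, j \in [m] : Y_{n+j} \le c_j \,\}.
```

Here $\mathcal{S} \cap \mathcal{H}_0$ collects the selected test points whose unobserved label does not exceed its threshold, so the ratio is the fraction of false selections among all selections.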

Figures (16)

  • Figure 1: Overview of OptCS. (Optional) Split labeled data into preparatory and calibration sets. 1. P-value. We construct conformal p-values $\{p_j\}$ based on conformity scores from any data-dependent model optimization process $\mathcal{V}(\cdot)$ obeying a mild permutation-equivariance condition. 2. Selection threshold. We calibrate individual thresholds $\widehat{R}_j$ obeying a mild permutation-invariance condition. 3. Multiple testing. We combine $\{p_j\}$ and $\{\widehat{R}_j\}$ to produce a selection set $\mathcal{S}$ with finite-sample, distribution-free FDR control.
  • Figure 2: Preview of numerical results. Left: Performance of OptCS-MSel which selects from pre-trained models. Middle: Performance of OptCS-Full which leverages full data for model fitting with a given model class. Right: Performance of OptCS-Full-MSel which combines model selection and full-sample training.
  • Figure 3: Visualization of a score-generating functional.
  • Figure 4: OptCS-MSel modifies the naive approach in Section \ref{subsec:challenges} by performing individual model selection for each test point, replacing SCS in the evaluation step with a similar quantity $\mathcal{S}_j(\cdot)$ that is permutation invariant to the calibration data and the $j$-th test point. The selected models are used to calibrate the final selection set.
  • Figure 5: OptCS-Full-MSel uses all labeled data for model training (with ideas similar to OptCS-Full), model selection (with ideas similar to OptCS-MSel), and final multiple testing with selected models.
  • ...and 11 more figures

Theorems & Definitions (15)

  • Definition 1
  • Definition 2: Score-generating functional
  • Remark 3.1
  • Definition 3: Monotonicity for the null
  • Definition 4: Permutation equivariance
  • Definition 5: Permutation invariance under the null
  • Theorem 3.2
  • Lemma 3.3: FDR decomposition
  • Remark 3.4: Design principle of $\widehat{R}_j$
  • Proposition 4.1
  • ...and 5 more