DANCE: Doubly Adaptive Neighborhood Conformal Estimation

Brandon R. Feng, Brian J. Reich, Daniel Beaglehole, Xihaier Luo, David Keetae Park, Shinjae Yoo, Zhechao Huang, Xueyu Mao, Olcay Boz, Jungeum Kim

TL;DR

DANCE is a doubly locally adaptive, nearest-neighbor-based conformal algorithm that combines two novel nonconformity scores, computed directly from the data's embedded representation, to produce the final prediction sets for uncertainty quantification.

Abstract

Recent developments in complex deep learning models have led to an unprecedented ability to predict accurately across multiple data representation types. Conformal prediction for uncertainty quantification of these models has risen in popularity, providing adaptive, statistically valid prediction sets. For classification tasks, conformal methods have typically relied on logit scores. For pre-trained models, however, this can result in inefficient, overly conservative prediction sets when the model is not calibrated to the target task. We propose DANCE, a doubly locally adaptive, nearest-neighbor-based conformal algorithm that combines two novel nonconformity scores computed directly from the data's embedded representation. DANCE first fits a task-adaptive kernel regression model on the embedding layer, then uses the learned kernel space to produce the final prediction sets for uncertainty quantification. We test against state-of-the-art local, task-adapted, and zero-shot conformal baselines, demonstrating DANCE's superior blend of set-size efficiency and robustness across various datasets.

Paper Structure (33 sections, 2 theorems, 27 equations, 5 figures, 5 tables, 1 algorithm)

Key Result

Theorem 3.1

Assume that the calibration sample $\mathcal{D}_{\rm cal}$ together with the test observation $(Z_{n+1},y_{n+1})$ is exchangeable. Then the prediction set $\hat{C}_{knn}(Z_{n+1})$ defined in eq:knn_pred_C satisfies a coverage guarantee, where the upper bound holds provided that there are no ties among the nonconformity scores and $q_{knn}\leq m_{knn}$.
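To make the role of the rank cutoff $q_{knn}$ and the condition $q_{knn}\leq m_{knn}$ more concrete, here is a minimal Python sketch of a split-conformal, rank-based nearest-neighbor prediction set. It is an illustration under stated assumptions, not the paper's exact score: the nonconformity score is assumed to be the rank of the nearest reference point that carries the candidate label, plain Euclidean distance stands in for the learned kernel space, and the reference set `ref_Z`, `ref_y` and all function names are hypothetical.

```python
import numpy as np

def knn_rank_score(z, y, ref_Z, ref_y, m_knn):
    """Assumed score: 1-based rank of the nearest reference point with label y,
    or m_knn + 1 if none of the m_knn nearest neighbors carries that label."""
    dist = np.linalg.norm(ref_Z - z, axis=1)            # Euclidean stand-in
    order = np.argsort(dist, kind="stable")[:m_knn]     # m_knn nearest neighbors
    hits = np.flatnonzero(ref_y[order] == y)
    return int(hits[0]) + 1 if hits.size else m_knn + 1

def conformal_rank_cutoff(cal_scores, alpha, m_knn):
    """Standard split-conformal quantile: the ceil((n+1)(1-alpha))-th smallest
    calibration score. Returns m_knn + 1 when that order statistic does not exist."""
    n = len(cal_scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        return m_knn + 1
    return int(np.sort(cal_scores)[k - 1])

def knn_prediction_set(z, ref_Z, ref_y, q_knn, m_knn, labels):
    """Labels whose assumed score is at most q_knn, i.e. the labels seen among
    the q_knn nearest neighbors; falls back to all labels if q_knn > m_knn."""
    if q_knn > m_knn:
        return set(int(l) for l in labels)
    dist = np.linalg.norm(ref_Z - z, axis=1)
    order = np.argsort(dist, kind="stable")[:q_knn]
    return set(ref_y[order].tolist())

# Toy data (hypothetical): two well-separated classes.
ref_Z = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]])
ref_y = np.array([0, 0, 1, 1])
cal_Z = np.array([[0.1, 0.1], [5.1, 5.2], [0.2, 0.9], [4.9, 5.8]])
cal_y = np.array([0, 1, 0, 1])
m_knn, alpha = 3, 0.25

scores = [knn_rank_score(z, y, ref_Z, ref_y, m_knn) for z, y in zip(cal_Z, cal_y)]
q_knn = conformal_rank_cutoff(scores, alpha, m_knn)
print(knn_prediction_set(np.array([0.0, 0.2]), ref_Z, ref_y, q_knn, m_knn, labels=[0, 1]))
```

In this sketch the calibration and test scores are computed against the same reference set, so they remain exchangeable and the usual split-conformal lower bound of $1-\alpha$ applies. When the cutoff exceeds $m_{knn}$, the set falls back to all labels, which is one way to read why a nontrivial bound would require $q_{knn}\leq m_{knn}$.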

Figures (5)

  • Figure 1: The DANCE pipeline. (a) The calibration set is used to compute nonconformity scores for a given test input. Our architectural contribution is the introduction of a task-adapted Recursive Feature Machines (RFM) kernel together with two complementary nonconformity scores. (b) The Neighbor Nonconformity module derives a rank cutoff $q_{\mathrm{knn}}$ in the kernel space from the calibration set. This cutoff is used to identify the nearest labels and to construct the candidate label set $\hat{C}_{\mathrm{knn}}$ for the test input. (c) The Contrastive Nonconformity module defines a density cutoff $q_{\mathrm{clr}}$, below which a second candidate label set $\hat{C}_{\mathrm{clr}}$ is formed. The intersection of $\hat{C}_{\mathrm{knn}}$ and $\hat{C}_{\mathrm{clr}}$ gives the final conformal prediction set $\hat{C}_{\mathrm{dance}}$, which attains $1-\alpha$ coverage.
  • Figure 2: Comparison of embedding projections in the original and RFM kernel-transformed spaces for the 10 most frequent classes (represented by colors) of the ImageNet-R dataset. Note the per-class clustering in the transformed kernel space. (a) Original embeddings; (b) Kernel-transformed embeddings.
  • Figure 3: Ablation study of UQ based on $\lambda$. We observe that mixing $S_{knn}$ and $S_{clr}$ generally provides a trade-off between CCV and set size.
  • Figure 4: Additional ablation studies. Top Row: Impact of the number of neighbors $m_{knn}$. Bottom Row: Impact of the number of neighbors $m_{clr}$.
  • Figure 5: RFM Training Set Size vs Training Time (Both axes are log-scaled.)

Theorems & Definitions (3)

  • Theorem 3.1
  • Proposition 3.2
  • Proof