Conformal changepoint localization

Rohan Hore, Aaditya Ramdas

TL;DR

This work introduces CONCH, a distribution-free method that builds finite-sample confidence sets for a single changepoint. A conformal Neyman-Pearson lemma yields principled score functions, a universality result shows that any distribution-free changepoint localization method must be an instance of CONCH, and experiments suggest that CONCH delivers precise confidence sets even in challenging settings involving images or text.

Abstract

We study the problem of offline changepoint localization in a distribution-free setting. One observes a vector of data with a single changepoint, assuming that the data before and after the changepoint are iid (or more generally exchangeable) from arbitrary and unknown distributions. The goal is to produce a finite-sample confidence set for the index at which the change occurs without making any other assumptions. Existing methods often rely on parametric assumptions, tail conditions, or asymptotic approximations, or only produce point estimates. In contrast, our distribution-free algorithm, CONformal CHangepoint localization (CONCH), only leverages exchangeability arguments to construct confidence sets with finite sample coverage. By proving a conformal Neyman-Pearson lemma, we derive principled score functions that yield informative (small) sets. Moreover, with such score functions, the normalized length of the confidence set shrinks to zero under weak assumptions. We also establish a universality result showing that any distribution-free changepoint localization method must be an instance of CONCH. Experiments suggest that CONCH delivers precise confidence sets even in challenging settings involving images or text.

Paper Structure (62 sections, 18 theorems, 182 equations, 12 figures, 5 algorithms)

This paper contains 62 sections, 18 theorems, 182 equations, 12 figures, 5 algorithms.

Key Result

Theorem 3.1

For each $t \in [n]$, the $p$-value $p_t$ defined in eq:pvalue_conch is valid under $\mathcal{H}_{0,t}$, i.e., for any $\alpha \in (0,1)$, $\mathbb{P}_{\xi}\left(p_{\xi}\leq \alpha\right)\leq \alpha$. Consequently, $\mathcal{C}^{\mathrm{CONCH}}_{1-\alpha}$ is a distribution-free confidence set for the changepoint.
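
The "consequently" step is ordinary test inversion: if the confidence set keeps every candidate index whose $p$-value exceeds $\alpha$, then the true changepoint $\xi$ is excluded only when $p_{\xi} \leq \alpha$, which happens with probability at most $\alpha$. The sketch below illustrates only that inversion step. It assumes the set has the usual form $\{t \in [n] : p_t > \alpha\}$ (the defining equation is not reproduced on this page), it takes the $p$-values as given rather than computing them, and the function name `invert_pvalues` and the demo values are hypothetical.

```python
import numpy as np


def invert_pvalues(p_values, alpha=0.05):
    """Return the 1-indexed candidate changepoints t with p_t > alpha.

    Assumption: the confidence set is obtained by standard test inversion.
    The p-values must come from elsewhere (e.g. a CONCH implementation);
    this helper does not compute them.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    p = np.asarray(p_values, dtype=float)
    # flatnonzero gives 0-based positions; shift to match t in [n] = {1, ..., n}.
    return (np.flatnonzero(p > alpha) + 1).tolist()


# Made-up p-values for n = 8 candidate indices, for illustration only.
p_demo = [0.001, 0.01, 0.04, 0.30, 0.62, 0.20, 0.03, 0.002]
conf_set = invert_pvalues(p_demo, alpha=0.05)
print(conf_set)  # [4, 5, 6]
```

Nothing in the inversion forces the retained indices to be contiguous, so the sketch returns a list of indices rather than an interval.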

Figures (12)

  • Figure 1: Distribution of conformal $p$-values (Gaussian mean-shift) for different methods.
  • Figure 2: Refinement of bootstrap-based confidence sets using CONCH-CAL under Gaussian and Laplace mean-shift models.
  • Figure 3: Illustration of the DomainNet changepoint setup: samples switch from the real to the sketch domain at $\xi = 350$ ($n = 800$). Images are drawn from the DomainNet dataset, which was collected via online search; class labels may not perfectly align with visual semantics, making the domain-shift detection problem more challenging.
  • Figure 4: $p$-values for domain shift detection between real and sketch domains: classifier scores (left) and CONCH $p$-values (right).
  • Figure 5: CONCH $p$-values for sentiment shift in SST-2: from positive to negative reviews at $\xi=400$ (left), and from 60% positive to 40% positive (right).
  • ...and 7 more figures

Theorems & Definitions (31)

  • Definition 1
  • Theorem 3.1
  • Remark 3.1: Monte-Carlo $p$-values
  • Remark 3.2: Exact validity
  • Remark 3.3: Time-reversal symmetry
  • Proposition 4.1
  • Lemma 4.2: Second conformal NP lemma
  • Theorem 4.3
  • Theorem 5.1: Sharpness with oracle LLR score
  • Theorem 5.2: Asymptotic sharpness
  • ...and 21 more