Exact and Approximate Conformal Inference for Multi-Output Regression

Chancellor Johnstone, Eugene Ndiaye

TL;DR

Multi-output regression is explored, delivering exact derivations of conformal inference $p$-values when the predictive model can be described as a linear function of $y$.

Abstract

It is common in machine learning to estimate a response $y$ given covariate information $x$. However, these predictions alone do not quantify any uncertainty associated with said predictions. One way to overcome this deficiency is with conformal inference methods, which construct a set containing the unobserved response $y$ with a prescribed probability. Unfortunately, even with a one-dimensional response, conformal inference is computationally expensive despite recent encouraging advances. In this paper, we explore multi-output regression, delivering exact derivations of conformal inference $p$-values when the predictive model can be described as a linear function of $y$. Additionally, we propose \texttt{unionCP} and a multivariate extension of \texttt{rootCP} as efficient ways of approximating the conformal prediction region for a wide array of multi-output predictors, both linear and nonlinear, while preserving computational advantages. We also provide both theoretical and empirical evidence of the effectiveness of these methods using both real-world and simulated data.
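
To make the notion of a conformal $p$-value concrete, the following is a minimal sketch of full conformal prediction over a grid of candidate responses, for a one-dimensional response and ridge regression. It is a generic illustration and not the paper's exact derivation, unionCP, or rootCP; the function name `conformal_pvalue`, the penalty `lam`, the absolute-residual conformity score, and the synthetic data are all assumptions made for this example.

```python
import numpy as np

def conformal_pvalue(X, y, x_new, z, lam=1.0):
    """Full conformal p-value of candidate z at x_new (ridge fit, |residual| score)."""
    X_aug = np.vstack([X, x_new])          # augment covariates with the test point
    y_aug = np.append(y, z)                # augment responses with the candidate z
    d = X_aug.shape[1]
    # Ridge fit on the augmented dataset (refit for every candidate z)
    beta = np.linalg.solve(X_aug.T @ X_aug + lam * np.eye(d), X_aug.T @ y_aug)
    scores = np.abs(y_aug - X_aug @ beta)  # conformity scores: absolute residuals
    # Fraction of scores at least as large as the candidate's own score
    return np.mean(scores >= scores[-1])

# Hypothetical synthetic data, used only to exercise the function
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.5, size=30)
x_new = rng.normal(size=3)

alpha = 0.1
grid = np.linspace(y.min() - 3, y.max() + 3, 400)
pvals = np.array([conformal_pvalue(X, y, x_new, z) for z in grid])
region = grid[pvals > alpha]  # grid approximation of the conformal prediction set
```

Each candidate requires a full refit, which is the computational cost the abstract refers to; in several output dimensions the grid grows exponentially with the dimension of $y$.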

Paper Structure (28 sections, 4 theorems, 26 equations, 6 figures, 2 tables)

Key Result

Proposition 1

Assume the fitted model as in eqn:colspace, where $H(x_{n+1}, x_i) = H$. Then, if we define $y(z) = (y^\top,z)^\top$, we can describe the vector of residuals associated with the augmented dataset and some candidate value $z$ as $\hat{y}(z) - Hy(z) = A-Bz$ where $A = (I - H)y(0)$ and $B = (I - H)(0,\
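
The affine dependence of the residuals on the candidate value can be checked numerically. The sketch below assumes a ridge smoother for $H$ and, because the statement above is truncated, writes the residual matrix as $A + b z^\top$ with $b$ the last column of $I - H$. This sign convention, the penalty `lam`, and all variable names are assumptions for illustration and may differ from the paper's notation.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, q, lam = 20, 4, 2, 0.5
X_aug = rng.normal(size=(n + 1, d))  # n training covariates plus the test covariate
Y = rng.normal(size=(n, q))          # observed multi-output responses

# Ridge hat matrix on the augmented covariates; it does not depend on z
H = X_aug @ np.linalg.solve(X_aug.T @ X_aug + lam * np.eye(d), X_aug.T)
M = np.eye(n + 1) - H

def residuals(z):
    """Residuals after refitting on the augmented data with candidate response z."""
    Y_z = np.vstack([Y, z])
    return Y_z - H @ Y_z

A = M @ np.vstack([Y, np.zeros(q)])  # residuals with the candidate set to zero
b = M[:, -1]                         # last column of I - H

z = np.array([0.7, -1.3])
affine = A + np.outer(b, z)          # A + b z^T, affine in the candidate z
max_gap = np.max(np.abs(residuals(z) - affine))
```

Because `A` and `b` are computed once, the residuals for any candidate follow without refitting the model, which is what makes an exact treatment of the $p$-values tractable for predictors that are linear in $y$.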

Figures (6)

  • Figure 1: Comparing gridCP contours (left) to $p$-value change-point sets (middle) constructed using $||\cdot||^2_W$ with $W = \hat{\Sigma}^{-1}$ for an observation from the cement dataset. We also include a comparison of a gridCP prediction set (shown with the black line) and a unionCP set (right) for $\alpha = 0.25$.
  • Figure 2: Example of Algorithm 1 for constructing $\mathcal{E}_i$ for an observation from the cement dataset (left). The "$\bullet$" identifies $\tilde{z}$, while the grey line represents the border of the $p$-value change-point set. Corner points are identified with "$\bullet$". The axes generated with $\tilde{z}$ are shown with the dotted black lines. We also include the collection of $p$-value change-point sets (right).
  • Figure 3: Illustration of the approximated conformal prediction set obtained by fitting an ellipse and a convex hull to boundary points found by rootCP. We use scikit-learn's make_regression to generate a synthetic dataset with n_samples = $15$, n_features = $5$, and n_targets = $2$, the dimension of the output $y_{n+1}$. We select $80\%$ of the features as informative and $60\%$ for the effective rank (described as the approximate number of singular vectors required to explain most of the input data by linear combinations), and the standard deviation of the random noise is set to $5$.
  • Figure 4: Illustration of the approximated $\mathcal{E}_i$ using $30$ search directions, with conformity measures defined by $\ell_p$ norms. Solid black lines denote convex hull approximations of each $\mathcal{E}_i$ using calculated boundary points.
  • Figure 5: Comparison of empirical coverage with random selection of $k$ regions and unionCP using ridge regression (RR), local-constant regression (LC) and local-linear regression (LL) across 10 repetitions.
  • ...and 1 more figure

Theorems & Definitions (5)

  • Proposition 1
  • Proposition 2
  • Proposition 3
  • Proposition 4
  • Definition 1: unionCP