Semiparametric conformal prediction
Ji Won Park, Robert Tibshirani, Kyunghyun Cho
TL;DR
The paper constructs valid prediction sets for multi-target regression by modeling the joint distribution of vector-valued non-conformity scores with nonparametric vine copulas and applying a semiparametric one-step correction to the estimated $1-\alpha$ quantile. The resulting prediction sets have asymptotically exact coverage, remain valid when calibration labels are missing at random, and are competitively efficient. The method combines copula-based density estimation with efficient influence-function theory to debias the quantile estimate, and it is supported by theoretical guarantees and by experiments on synthetic and real datasets. Because of the vine decomposition, it scales to a large number of targets, which suits risk-sensitive applications where prediction errors are correlated.
Abstract
Many risk-sensitive applications require well-calibrated prediction sets over multiple, potentially correlated target variables, for which the prediction algorithm may report correlated errors. In this work, we aim to construct the conformal prediction set accounting for the joint correlation structure of the vector-valued non-conformity scores. Drawing from the rich literature on multivariate quantiles and semiparametric statistics, we propose an algorithm to estimate the $1-\alpha$ quantile of the scores, where $\alpha$ is the user-specified miscoverage rate. In particular, we flexibly estimate the joint cumulative distribution function (CDF) of the scores using nonparametric vine copulas and improve the asymptotic efficiency of the quantile estimate using its influence function. The vine decomposition allows our method to scale well to a large number of targets. As well as guaranteeing asymptotically exact coverage, our method yields desired coverage and competitive efficiency on a range of real-world regression problems, including those with missing-at-random labels in the calibration set.
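To make the idea of a joint $1-\alpha$ quantile of vector-valued scores concrete, here is a minimal sketch. It is not the method proposed in the paper: it swaps the vine copula for the empirical copula (per-target ranks), searches only along the diagonal of the copula domain, and omits the influence-function correction. The function name `joint_score_quantile` and the choice of absolute residuals as scores are assumptions made for illustration.

```python
import numpy as np

def joint_score_quantile(scores, alpha):
    """Per-target thresholds q such that about 1 - alpha of the calibration
    score vectors fall jointly below q (illustrative sketch only).

    scores: (n, d) array of non-conformity scores, one column per target.
    Assumes continuous scores, i.e. no ties within a column.
    """
    scores = np.asarray(scores, dtype=float)
    n, d = scores.shape
    # Rank each column separately: pseudo-observations in {1/n, ..., 1}.
    ranks = scores.argsort(axis=0).argsort(axis=0) + 1
    u = ranks / n
    # A score vector lies in the box [0, t]^d iff its largest rank is <= t.
    m = u.max(axis=1)
    # Conformal-style order statistic with the usual (n + 1) correction.
    k = min(n, int(np.ceil((n + 1) * (1 - alpha))))
    t = np.sort(m)[k - 1]
    # Map the common level t back to each target's own score scale.
    r = int(round(t * n))
    return np.sort(scores, axis=0)[r - 1, :]

# Hypothetical calibration scores: correlated absolute residuals, 3 targets.
rng = np.random.default_rng(0)
cov = 0.7 * np.ones((3, 3)) + 0.3 * np.eye(3)
residuals = rng.multivariate_normal(np.zeros(3), cov, size=500)
scores = np.abs(residuals)
q = joint_score_quantile(scores, alpha=0.1)
joint_coverage = np.all(scores <= q, axis=1).mean()
```

With absolute residuals as scores, the resulting set for a new input would be the box with half-width `q[j]` around the point prediction for target `j`. Because the thresholds share one rank level `t`, correlated errors are accounted for jointly rather than through a per-target Bonferroni split.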
