Volume-Sorted Prediction Set: Efficient Conformal Prediction for Multi-Target Regression

Rui Luo, Zhixin Zhou

TL;DR

The paper addresses uncertainty quantification in multi-target regression with the Volume-Sorted Prediction Set (VSPS). A conditional normalizing flow $Z=f_\phi(Y,X)$ maps the conditional distribution $p_{Y|X}(y|x)$ to a known latent distribution, and the absolute Jacobian determinant $\left|\det\left(\frac{\partial f_\phi(y,x)}{\partial y}\right)\right|$ identifies dense regions of the original space. VSPS builds each prediction region as a union of balls in the original space, centered on high-density samples. The ball radius $\gamma$ is calibrated by split conformal prediction to guarantee $(1-\alpha)$ coverage, and the number of centers $K^*$ is chosen on a validation set. The theoretical analysis proves the coverage guarantee under exchangeability. Experiments on synthetic and real data show that VSPS yields smaller, more informative regions than ST-DQR, NPDQR, and Naïve QR while maintaining coverage. By combining flexible region shapes with principled calibration, the approach targets high-dimensional settings with dependent targets. Because its regions can be non-convex and data-adaptive, the method is relevant to domains that need reliable joint uncertainty estimates, such as pharmacology, environmental modeling, and finance.

Abstract

We introduce Volume-Sorted Prediction Set (VSPS), a novel method for uncertainty quantification in multi-target regression that uses conditional normalizing flows with conformal calibration. This approach constructs flexible, non-convex predictive regions with guaranteed coverage probabilities, overcoming limitations of traditional methods. By learning a transformation where the conditional distribution of responses follows a known form, VSPS identifies dense regions in the original space using the Jacobian determinant. This enables the creation of prediction regions that adapt to the true underlying distribution, focusing on areas of high probability density. Experimental results demonstrate that VSPS produces smaller, more informative prediction regions while maintaining robust coverage guarantees, enhancing uncertainty modeling in complex, high-dimensional settings.
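
The role of the Jacobian determinant can be read off the standard change-of-variables identity for an invertible flow. The block below is a reminder of that general identity, not a formula quoted from the paper; $p_Z$ denotes the known latent density (a standard normal according to the Figure 1 caption).

```latex
\[
  p_{Y|X}(y \mid x)
  \;=\;
  p_Z\bigl(f_\phi(y, x)\bigr)\,
  \left|\det\!\left(\frac{\partial f_\phi(y, x)}{\partial y}\right)\right|
\]
```

For a fixed latent density value, a larger absolute Jacobian determinant therefore corresponds to a larger conditional density at $y$, which is consistent with sorting samples by that quantity.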

Paper Structure

This paper contains 19 sections, 2 theorems, 17 equations, 2 figures, 2 tables, 4 algorithms.

Key Result

Lemma 1

Assume that the random variables $\{ (X_i, Y_i) \}_{i=1}^{N}$ are exchangeable. Then, the transformed random variables $\{ (X_i, Z_i) \}_{i=1}^{N}$, where $Z_i = f_\phi(Y_i, X_i)$, are also exchangeable.
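
A short argument for why such a statement holds is sketched below. It assumes that $f_\phi$ is a fixed measurable map that does not depend on the $N$ points in question (for example, because it was trained on separate data); that assumption, the helper map $g$, and the permutation $\pi$ are introduced here for illustration, and the paper's own proof may be organized differently.

```latex
\[
  g(x, y) := \bigl(x,\, f_\phi(y, x)\bigr)
  \quad\Longrightarrow\quad
  (X_i, Z_i) = g(X_i, Y_i), \qquad i = 1, \dots, N .
\]
\[
  \bigl(g(X_{\pi(1)}, Y_{\pi(1)}), \dots, g(X_{\pi(N)}, Y_{\pi(N)})\bigr)
  \;\overset{d}{=}\;
  \bigl(g(X_1, Y_1), \dots, g(X_N, Y_N)\bigr)
  \quad \text{for every permutation } \pi ,
\]
```

The second line follows because applying the same fixed map $g$ to every coordinate of two equally distributed random vectors yields equally distributed vectors, and exchangeability of the $(X_i, Y_i)$ gives the equality in distribution before $g$ is applied.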

Figures (2)

  • Figure 1: Scheme of the Volume-Sorted Prediction Set (VSPS) method. The CNF maps the original distribution to a standard normal distribution. Samples are generated, sorted, and selected based on their absolute Jacobian determinant. Top $K$ samples are transformed back to form the prediction region as balls of radius $\gamma$ centered at the samples. $\gamma$ is calibrated to ensure the desired $1-\alpha$ coverage, whereas the optimal $K$ is determined using a validation set. See the methodology section for details.
  • Figure 2: Comparison of prediction regions for the Bio dataset using four methods: Naïve QR, NPDQR, ST-DQR, and our proposed VSPS. The dataset is partitioned into three clusters using K-Means. Circles represent covered points, while triangles indicate miscovered points. Each row corresponds to a different method, with VSPS demonstrating the most compact prediction regions across all clusters while maintaining the $1-\alpha$ coverage. The Bio dataset comprises two protein structural features ($Y_0$ and $Y_1$) as response variables and an 8-dimensional protein feature vector ($X$) as predictors.
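
The pipeline in the Figure 1 caption (sample, sort by absolute Jacobian determinant, keep the top $K$ centers, calibrate $\gamma$) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the toy flow $f(y, x) = \operatorname{arcsinh}(y - \bar{x})$ stands in for a trained conditional normalizing flow, the function names are hypothetical, and the sample sizes are arbitrary. The nonconformity score is the distance to the nearest center, since a point lies in a union of radius-$\gamma$ balls exactly when that distance is at most $\gamma$.

```python
import numpy as np

def toy_inverse_flow(z, x):
    """Hypothetical inverse flow y = f^{-1}(z; x); a stand-in for a trained CNF."""
    return x.mean() + np.sinh(z)

def toy_abs_jacobian_det(y, x):
    """|det(df/dy)| for the toy forward map f(y, x) = arcsinh(y - mean(x))."""
    return np.prod(1.0 / np.sqrt(1.0 + (y - x.mean()) ** 2), axis=-1)

def top_k_centers(x, k, n_samples, rng):
    """Sample latent points, map them back, keep the k with the largest |det J|."""
    z = rng.standard_normal((n_samples, 2))
    y = toy_inverse_flow(z, x)
    order = np.argsort(-toy_abs_jacobian_det(y, x))  # descending order
    return y[order[:k]]

def score(y, centers):
    """Distance from y to its nearest center."""
    return np.linalg.norm(centers - y, axis=1).min()

def calibrate_radius(scores, alpha):
    """Split-conformal quantile: the ceil((n + 1)(1 - alpha))-th smallest score."""
    n = len(scores)
    rank = int(np.ceil((n + 1) * (1 - alpha)))
    if rank > n:  # too few calibration points for this alpha
        return np.inf
    return np.sort(scores)[rank - 1]

rng = np.random.default_rng(0)
alpha, k, n_samples = 0.1, 20, 200

def draw(n):
    """Toy data whose conditional law matches the toy flow exactly."""
    x = rng.standard_normal((n, 8))
    y = x.mean(axis=1, keepdims=True) + np.sinh(rng.standard_normal((n, 2)))
    return x, y

def all_scores(xs, ys):
    return np.array([score(y, top_k_centers(x, k, n_samples, rng))
                     for x, y in zip(xs, ys)])

x_cal, y_cal = draw(500)
gamma = calibrate_radius(all_scores(x_cal, y_cal), alpha)

x_test, y_test = draw(2000)
coverage = np.mean(all_scores(x_test, y_test) <= gamma)
print(f"gamma = {gamma:.3f}, empirical coverage = {coverage:.3f}")
```

In this sketch $K$ is fixed by hand; the source instead selects $K^*$ on a validation set, which could be imitated by repeating the calibration for several values of `k` and comparing the resulting region sizes.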

Theorems & Definitions (5)

  • Definition 1: Exchangeability
  • Lemma 1: Exchangeability of Transformed Samples
  • Proof (of Lemma 1)
  • Theorem 1: Coverage Guarantee
  • Proof of Theorem 1 (Coverage Guarantee)