Conformal Correction for Efficiency May be at Odds with Entropy

Senrong Xu, Tianyu Wang, Zenan Li, Yuan Yao, Taolue Chen, Feng Xu, Xiaoxing Ma

TL;DR

This work investigates how the efficiency of conformal prediction (smaller uncertainty sets) can conflict with predictive entropy. It identifies a theoretical and empirical trade-off between efficiency and entropy, and introduces EC$^\text{3}$, an entropy-aware conformal correction method that uses a focal-like loss with a negative entropy term and temperature scaling to navigate the Pareto frontier. On computer vision and graph tasks, EC$^\text{3}$ delivers substantial efficiency gains at a fixed entropy level, and an extension of the method improves conditional coverage. The results highlight the practical value of entropy control in CP and provide a framework for balancing set size against predictive entropy across diverse domains.

Abstract

Conformal prediction (CP) provides a comprehensive framework to produce statistically rigorous uncertainty sets for black-box machine learning models. To further improve the efficiency of CP, conformal correction has been proposed to fine-tune or wrap the base model with an extra module using a conformal-aware inefficiency loss. In this work, we empirically and theoretically identify a trade-off between CP efficiency and the entropy of the model prediction. We then propose an entropy-constrained conformal correction method, exploring a better Pareto optimum between efficiency and entropy. Extensive experimental results on both computer vision and graph datasets demonstrate the efficacy of the proposed method. For instance, it can significantly improve the efficiency of state-of-the-art CP methods by up to 34.4%, given an entropy threshold.
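To make "uncertainty set" and "efficiency" concrete, the following is a minimal sketch of split conformal prediction with a deterministic APS-style score, written with NumPy. It is an illustration under stated assumptions, not the authors' implementation: the function names, the synthetic logits, and the non-randomized form of the score are all choices made here for the example.

```python
import numpy as np

def softmax(logits):
    # Numerically stable row-wise softmax.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cumulative_scores(probs):
    """For every class, the probability mass of all classes ranked at or above it."""
    order = np.argsort(-probs, axis=1)  # classes sorted by decreasing probability
    cum_sorted = np.cumsum(np.take_along_axis(probs, order, axis=1), axis=1)
    scores = np.empty_like(probs)
    np.put_along_axis(scores, order, cum_sorted, axis=1)  # undo the sort
    return scores

def calibrate(cal_probs, cal_labels, alpha):
    """Conformal threshold: the ceil((n + 1)(1 - alpha))-th smallest calibration score."""
    s = np.sort(cumulative_scores(cal_probs)[np.arange(len(cal_labels)), cal_labels])
    k = int(np.ceil((len(s) + 1) * (1 - alpha)))
    return s[k - 1] if k <= len(s) else np.inf

def prediction_sets(probs, tau):
    """Boolean mask of shape (n, K): class y is in the set iff its score is <= tau."""
    return cumulative_scores(probs) <= tau

# Synthetic data (an assumption for illustration): Gaussian logits, true class boosted.
rng = np.random.default_rng(0)
K, n = 10, 2000
labels = rng.integers(0, K, size=2 * n)
logits = rng.normal(size=(2 * n, K))
logits[np.arange(2 * n), labels] += 2.0
probs = softmax(logits)

tau = calibrate(probs[:n], labels[:n], alpha=0.1)
sets = prediction_sets(probs[n:], tau)
coverage = sets[np.arange(n), labels[n:]].mean()   # fraction of sets containing the label
avg_size = sets.sum(axis=1).mean()                 # inefficiency: smaller is better
print(f"coverage={coverage:.3f}  average set size={avg_size:.2f}")
```

The average set size is the inefficiency measure referred to above: at the same coverage level, a method that returns smaller sets is more efficient.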

Paper Structure

This paper contains 25 sections, 3 theorems, 36 equations, 7 figures, 11 tables.

Key Result

Proposition 1

For a given sample point $x$ and the corresponding predictive distribution $\hat{\pi}(x)$, the average non-conformity score is upper-bounded by the prediction entropy. The bound involves the constant $C_K := \log\left(\sum_{k=1}^K \exp\left(-\frac{k-1}{K}\right)\right)$.
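As a quick numerical check of the constant in Proposition 1, the sketch below evaluates $C_K$ for a few class counts. Only the formula for $C_K$ is taken from the text; the function name `c_k` is a hypothetical helper introduced here. Because each of the $K$ summands lies in $(e^{-1}, 1]$, the constant satisfies $\log K - 1 < C_K \le \log K$, which the loop also verifies.

```python
import math

def c_k(K: int) -> float:
    """C_K = log(sum_{k=1}^{K} exp(-(k - 1) / K))."""
    if K < 1:
        raise ValueError("K must be a positive integer")
    return math.log(sum(math.exp(-(k - 1) / K) for k in range(1, K + 1)))

for K in (2, 10, 100):
    value = c_k(K)
    # Sanity check of the elementary bounds log K - 1 < C_K <= log K.
    assert math.log(K) - 1 < value <= math.log(K)
    print(f"K={K:>3}  C_K={value:.4f}  log K={math.log(K):.4f}")
```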

Figures (7)

  • Figure 1: (a) The Pareto frontier between inefficiency and entropy; lower is better for both. (b) and (c) Results of training with only $\mathcal{L}_{\mathrm{class}}$ on CIFAR100 and Cora-ML, respectively, showing the efficiency and entropy on the test set during conformal correction. A trade-off between the two appears once the accuracy reaches its peak.
  • Figure 2: (a) An illustration of Proposition 1 for $K=2$, showing that the tight upper bound of $\bar{V}(\hat{\pi}(x))$ consists of two pieces. (b) and (c) Efficiency and entropy curves of APS (cf. the preliminaries section) on the test set after temperature scaling, as functions of $T$, with $\alpha=0.1$. In most cases, the efficiency of APS is at odds with the entropy of the model prediction.
  • Figure 3: Pareto optima of different conformal correction methods. Compared with the baselines, the proposed EC$^\text{3}$ obtains the best Pareto frontier by achieving a better balance between efficiency and entropy on both CIFAR10 and Cora-ML.
  • Figure 4: Class-conditional coverage results on CIFAR10 (left) and Cora-ML (right). Class coverages below 0.9 are shaded. Our method EC$^\text{3}$(Cond) raises most of the class coverages that fall below 0.9.
  • Figure 5: Efficiency and entropy results of APS on the test set after temperature scaling w.r.t. $T$ when $\alpha=0.1$.
  • ...and 2 more figures
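The temperature-scaling experiments described in the captions of Figures 2 and 5 can be mimicked on synthetic data with the sketch below. It sweeps $T$, recalibrates a deterministic APS-style threshold at $\alpha=0.1$ for each value, and reports the mean predictive entropy and the average set size. The synthetic logits, the helper names, and the non-randomized score are assumptions made for illustration; the numbers it prints say nothing about the paper's datasets.

```python
import numpy as np

def softmax(logits, T=1.0):
    z = logits / T
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def mean_entropy(probs, eps=1e-12):
    # Average Shannon entropy (in nats) of the predictive distributions.
    return float(-(probs * np.log(probs + eps)).sum(axis=1).mean())

def sorted_cumsum(probs):
    # Class order by decreasing probability, and cumulative mass in that order.
    order = np.argsort(-probs, axis=1)
    return order, np.cumsum(np.take_along_axis(probs, order, axis=1), axis=1)

def sweep_temperature(cal_logits, cal_labels, test_logits, temps, alpha=0.1):
    """Return a list of (T, mean test entropy, average test set size)."""
    rows = []
    for T in temps:
        order, cum = sorted_cumsum(softmax(cal_logits, T))
        rank = np.argmax(order == cal_labels[:, None], axis=1)  # rank of true label
        scores = np.sort(cum[np.arange(len(cal_labels)), rank])
        k = int(np.ceil((len(scores) + 1) * (1 - alpha)))
        tau = scores[k - 1] if k <= len(scores) else np.inf
        test_probs = softmax(test_logits, T)
        _, test_cum = sorted_cumsum(test_probs)
        avg_size = float((test_cum <= tau).sum(axis=1).mean())
        rows.append((T, mean_entropy(test_probs), avg_size))
    return rows

# Synthetic logits (an assumption): Gaussian noise with the true class boosted.
rng = np.random.default_rng(0)
K, n = 10, 2000
labels = rng.integers(0, K, size=2 * n)
logits = rng.normal(size=(2 * n, K))
logits[np.arange(2 * n), labels] += 2.0

results = sweep_temperature(logits[:n], labels[:n], logits[n:], temps=[0.5, 1.0, 2.0, 4.0])
for T, H, size in results:
    print(f"T={T:.1f}  entropy={H:.3f}  avg set size={size:.2f}")
```

Raising $T$ flattens the softmax and therefore increases entropy; how the average set size responds depends on the data and the score, which is the relationship the figures examine.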

Theorems & Definitions (7)

  • Proposition 1
  • Proposition 2
  • Theorem 3
  • Remark 4
  • Proof 1
  • Proof 2
  • Proof 3