Conformal Prediction in Hierarchical Classification with Constrained Representation Complexity

Thomas Mortier, Alireza Javanmardi, Yusuf Sale, Eyke Hüllermeier, Willem Waegeman

TL;DR

This work addresses the problem of producing prediction sets with nominal coverage in hierarchical classification while keeping them interpretable. It extends split conformal prediction with two methods: CRSVP, which restricts prediction sets to a single node of the hierarchy ($R_{\mathcal{T}}(\hat{Y})=1$), and CRSVP-$r$, which allows a representation complexity of $R_{\mathcal{T}}(\hat{Y})\le r$. Each comes with calibration and inference procedures that provide distribution-free finite-sample guarantees. Empirical results across diverse benchmarks show that allowing a representation complexity above one yields more efficient (smaller) yet still semantically meaningful prediction sets, while randomized prediction sets ensure exact coverage. The findings suggest that representation complexity can act as a regularizer in challenging probability landscapes, and they point to future work extending the framework to more complex structures such as directed acyclic graphs.

Abstract

Conformal prediction has emerged as a widely used framework for constructing valid prediction sets in classification and regression tasks. In this work, we extend the split conformal prediction framework to hierarchical classification, where prediction sets are commonly restricted to internal nodes of a predefined hierarchy, and propose two computationally efficient inference algorithms. The first algorithm returns internal nodes as prediction sets, while the second one relaxes this restriction. Using the notion of representation complexity, the latter yields smaller set sizes at the cost of a more general and combinatorial inference problem. Empirical evaluations on several benchmark datasets demonstrate the effectiveness of the proposed algorithms in achieving nominal coverage.
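To make the notion of representation complexity concrete, here is a minimal Python sketch. It assumes that $R_{\mathcal{T}}(\hat{Y})$ counts the fewest tree nodes whose leaf sets together make up $\hat{Y}$ exactly, and it uses a hypothetical complete binary tree over eight classes in which node $v_{i}$ has children $v_{2i}$ and $v_{2i+1}$ and leaves $v_{8},\ldots,v_{15}$ stand for classes $1,\ldots,8$. The function names and the tree layout are illustrative assumptions, not the authors' implementation.

```python
def leaves_under(node, n_leaves=8):
    """Classes covered by heap-indexed node v_node (assumed complete binary tree)."""
    lo = hi = node
    while lo < n_leaves:  # descend until the leaf level (indices n_leaves .. 2*n_leaves - 1)
        lo, hi = 2 * lo, 2 * hi + 1
    return set(range(lo - n_leaves + 1, hi - n_leaves + 2))


def representation(target, node=1, n_leaves=8):
    """Smallest list of nodes whose leaf sets are disjoint and union to `target`."""
    covered = leaves_under(node, n_leaves)
    if not (covered & target):
        return []            # subtree contributes nothing to the target set
    if covered <= target:
        return [node]        # whole subtree lies in the target: one node suffices
    # Partially covered internal node: recurse into both children.
    return (representation(target, 2 * node, n_leaves)
            + representation(target, 2 * node + 1, n_leaves))


print(representation({1, 2, 3, 4}))  # [2]      -> complexity 1
print(representation({1, 2, 3}))     # [4, 10]  -> complexity 2
print(representation({4, 5}))        # [11, 12] -> complexity 2
```

In this toy tree, the only single node that contains both class 4 and class 5 is the root, so a set restricted to complexity 1 would have to include all eight classes, whereas a budget of $r=2$ can return the two leaves alone. This is the kind of size reduction the relaxed restriction is meant to allow.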

Paper Structure

This paper contains 13 sections, 1 theorem, 11 equations, 6 figures, 4 tables, 5 algorithms.

Key Result

Theorem 2.1

Assume an exchangeable sequence $\{(\boldsymbol{x}_{i},y_{i},u_{i})\}_{i=1}^{N+1}$ and let $\hat{Y}(\boldsymbol{x},u,\tau)$ be a set-valued predictor that satisfies (eq:nested). Furthermore, assume that $\exists \tau \in \mathbb{R}: \hat{Y}(\boldsymbol{x},u,\tau)=\mathcal{Y}$. Then, for $\tau^{*}$ i

Figures (6)

  • Figure 1: A sample image of the Lotus corniculatus species from the PlantCLEF 2015 dataset goeau15lifeclef.
  • Figure 2: An example of a tree structure $\mathcal{T}$ with class space $\mathcal{Y}=\{1,\ldots,8\}$ and nodes $\mathcal{V}_{\mathcal{T}}=\{v_{1},\ldots,v_{15}\}$. The root $v_{1}$ represents the class space $\mathcal{Y}$ and leaves $\{v_{8},\ldots,v_{15}\}$ represent the individual classes. The numbers in the leaf nodes represent the class probabilities for an instance $\boldsymbol{x}$.
  • Figure 3: Trade-off between representation complexity (log scale) and efficiency for PlantCLEF 2015. The confidence level is set to 90%, and calibration and test sets are resampled 10 times.
  • Figure : CRSVP calibration -- Input: $\{(\boldsymbol{x}_{i},y_{i},u_{i})\}_{i=1}^{N}, \hat{P}, \mathcal{V}_{\mathcal{T}}$; Output: threshold in (eq:nested:tau).
  • Figure : CRSVP-$r$ calibration -- Input: $\{(\boldsymbol{x}_{i},y_{i},u_{i})\}_{i=1}^{N}, r, \hat{P}, \mathcal{V}_{\mathcal{T}}$; Output: threshold in (eq:nested:tau).
  • ...and 1 more figure

Theorems & Definitions (1)

  • Theorem 2.1: Marginal validity of nested conformal prediction angelopoulos20raps
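The calibration step that usually accompanies a marginal-validity result of this kind can be sketched as follows. Because the theorem statement above is truncated, this is the standard split conformal recipe for nested sets, not necessarily the paper's exact formulation: each calibration score is taken to be the smallest $\tau$ at which the true label enters the set, and the threshold is the $\lceil(1-\alpha)(N+1)\rceil$-th smallest score. The score $1-\hat{p}(y\mid\boldsymbol{x})$, the function names, and the omission of the randomization variable $u$ are simplifying assumptions made for illustration.

```python
import math


def conformal_threshold(scores, alpha):
    """Return the ceil((1 - alpha) * (N + 1))-th smallest calibration score."""
    n = len(scores)
    k = math.ceil((1.0 - alpha) * (n + 1))
    if k > n:
        # Too few calibration points for this alpha: only the full label set
        # can be guaranteed, which corresponds to an infinite threshold.
        return math.inf
    return sorted(scores)[k - 1]


def prediction_set(probs, tau):
    """Nested set family: all classes whose score 1 - p is at most tau."""
    return {y for y, p in probs.items() if 1.0 - p <= tau}


# Toy calibration scores (assumed: smallest tau at which the true label is included).
scores = [0.4, 0.1, 0.3, 0.2]
tau = conformal_threshold(scores, alpha=0.5)   # 3rd smallest of 4 scores

test_probs = {"a": 0.75, "b": 0.2, "c": 0.05}  # hypothetical class probabilities
print(tau, prediction_set(test_probs, tau))
```

The sets grow monotonically with `tau`, which is the nestedness property the theorem relies on, and the infinite threshold returns the whole label space, matching the assumption that some $\tau$ yields $\hat{Y}=\mathcal{Y}$.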