Partial-Label Learning with a Reject Option

Tobias Fuchs, Florian Kalinke, Klemens Böhm

TL;DR

This work tackles partial-label learning (PLL) on ambiguously labeled data, where misclassifications can be costly in safety-critical settings. It introduces Dst-Pll, a nearest-neighbor PLL method that maintains credal sets via Dempster-Shafer theory and uses Yager's rule to fuse evidence from neighbors. A novel adaptive reject option based on belief and plausibility decides whether to accept or reject a prediction, improving the trade-off between the number and accuracy of non-rejected predictions. The method is also shown to be risk-consistent. Empirical results on synthetic and real-world datasets show competitive predictive performance and superior rejection behavior across varying noise conditions, with runtime dominated by nearest-neighbor search. Code and data are released to support reproducibility.
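
To make the evidence-fusion step concrete, here is a minimal sketch of Yager's rule together with belief and plausibility. It is an illustration under stated assumptions, not the authors' implementation: the mass values given to each neighbor, the helper names, and in particular the `accept` rule (belief of the predicted label minus the largest plausibility of any rival label, compared against 0) are hypothetical and are not defined in this summary.

```python
from itertools import product

def yager_combine(m1, m2, frame):
    """Fuse two mass functions; conflicting mass is moved to the whole frame."""
    fused = {}
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        target = inter if inter else frame  # empty intersection = conflict
        fused[target] = fused.get(target, 0.0) + wa * wb
    return fused

def belief(m, subset):
    """Total mass of focal sets contained in `subset`."""
    return sum(w for a, w in m.items() if a <= subset)

def plausibility(m, subset):
    """Total mass of focal sets that intersect `subset`."""
    return sum(w for a, w in m.items() if a & subset)

def accept(m, label, frame):
    """Hypothetical reject rule (assumption): accept `label` only if its
    belief is at least the largest plausibility of any other label."""
    rival = max(
        (plausibility(m, frozenset([y])) for y in frame if y != label),
        default=0.0,
    )
    return belief(m, frozenset([label])) - rival >= 0.0

# Two neighbors with candidate sets {a, b} and {a}; the masses are made up.
frame = frozenset("abc")
m1 = {frozenset("ab"): 0.6, frame: 0.4}
m2 = {frozenset("a"): 0.8, frame: 0.2}
fused = yager_combine(m1, m2, frame)

print(belief(fused, frozenset("a")), plausibility(fused, frozenset("a")))
print(accept(fused, "a", frame), accept(fused, "b", frame))
```

Unlike Dempster's rule, Yager's rule does not renormalize: mass from contradictory neighbors is assigned to the full label space, so strong conflict shows up as a wide gap between belief and plausibility instead of being hidden.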

Abstract

In real-world applications, one often encounters ambiguously labeled data, where different annotators assign conflicting class labels. Partial-label learning allows training classifiers in this weakly supervised setting, where state-of-the-art methods already show good predictive performance. However, even the best algorithms give incorrect predictions, which can have severe consequences when they impact actions or decisions. We propose a novel risk-consistent nearest-neighbor-based partial-label learning algorithm with a reject option, that is, the algorithm can reject unsure predictions. Extensive experiments on artificial and real-world datasets show that our method provides the best trade-off between the number and accuracy of non-rejected predictions when compared to our competitors, which use confidence thresholds for rejecting unsure predictions. When evaluated without the reject option, our nearest-neighbor-based approach also achieves competitive prediction performance.
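
The trade-off mentioned above, between how many predictions are kept and how accurate the kept ones are, can be computed for a confidence-threshold baseline as in the following sketch. The function name and the toy inputs are assumptions made for illustration; they are not taken from the paper's code.

```python
def reject_tradeoff(confidences, predictions, labels, threshold):
    """Return (fraction_rejected, accuracy_of_accepted) for one threshold.

    A prediction is accepted when its confidence is >= threshold.
    Accuracy is None when every prediction is rejected.
    """
    accepted = [
        (p, y)
        for c, p, y in zip(confidences, predictions, labels)
        if c >= threshold
    ]
    fraction_rejected = 1.0 - len(accepted) / len(labels)
    if not accepted:
        return fraction_rejected, None
    accuracy = sum(p == y for p, y in accepted) / len(accepted)
    return fraction_rejected, accuracy

# Toy data: four test instances with made-up confidences.
confidences = [0.95, 0.60, 0.92, 0.40]
predictions = [0, 1, 1, 2]
labels = [0, 2, 1, 2]

for t in (0.0, 0.9):
    print(t, reject_tradeoff(confidences, predictions, labels, t))
```

Sweeping the threshold over its range and plotting the resulting pairs gives a trade-off curve of the kind described in the Figure 1 caption.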

Paper Structure

This paper contains 30 sections, 8 theorems, 27 equations, 17 figures, 3 tables, 1 algorithm.

Key Result

Theorem 4.2

Let $\mathcal{Y}$ be the label space, $\tilde{x} \in \mathcal{X}$ the instance of interest, $\tilde{s} \subseteq \mathcal{Y}$ its candidate labels ($\tilde{s} = \mathcal{Y}$ if $\tilde{x}$ is

Figures (17)

  • Figure 1: Trade-off between the fraction of rejected predictions and the accuracy of non-rejected predictions for three experiments: Ecoli with instance-dependent noise, KMNIST with instance-dependent noise, and the real-world dataset msrc-v2. We show the trade-off curves for varying confidence (0 to 1) and $\Delta_{\tilde{m}}$ (-1 to 1) thresholds. We highlight the points corresponding to a threshold of $\Delta_{\tilde{m}} = 0$ for our method, a confidence threshold of 90% for methods with a probability output, and a threshold of 50% of all votes for Pl-Knn. We refer to Appendix \ref{sec:app-tradeoff-curves} for all reject trade-off curves across all experimental settings.
  • Figure 2: Comparison of simulating and calculating the expected belief of the true label $\tilde{y}$ and of the most frequently co-occurring label $\tilde{y}_{\operatorname{c}}$ for $k \in \left[100\right]$, $l = 3$, $p_1 = 0.4$, $p_2 = 0.35$, and $p_3 = 0.25$.
  • Figure 3: Counts of the class labels in the candidate sets $s_i$ of instance $\tilde{x}$'s 10-nearest neighbors $x_i$.
  • Figure 4: Sensitivity of the test-set MCC score, and of the fraction and MCC score of confident (non-rejected) predictions, to the choice of $k$.
  • Figure 5: Trade-off between the fraction of rejected predictions and the accuracy of non-rejected predictions.
  • ...and 12 more figures

Theorems & Definitions (12)

  • Example 4.1: Classification rule
  • Theorem 4.2
  • Example 4.3: Reject option
  • Lemma 4.5
  • Theorem 4.6
  • Lemma B.1
  • proof
  • Definition C.1
  • Theorem C.2
  • Theorem C.3
  • ...and 2 more