Partial-Label Learning with a Reject Option

Tobias Fuchs; Florian Kalinke; Klemens Böhm

TL;DR

This work addresses partial-label learning (PLL), where training data are ambiguously labeled and misclassifications are costly in safety-critical settings. It introduces DST-PLL, a nearest-neighbor PLL method that maintains credal sets via Dempster-Shafer theory and uses Yager's rule to fuse evidence from neighbors. A novel adaptive reject option based on belief and plausibility decides whether to accept or reject each prediction, which improves the trade-off between the number and the accuracy of non-rejected predictions. The method is also proven to be risk consistent. Empirical results on synthetic and real-world datasets show competitive predictive performance and superior rejection behavior across varying noise conditions, with runtime dominated by the nearest-neighbor search. Code and data are released to support reproducibility.
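To make the evidence-fusion step concrete, the following is a minimal sketch of Yager's rule together with belief and plausibility. It assumes mass functions are stored as dictionaries from `frozenset` to mass. The function names and the toy masses are illustrative, and the `accept_gap` score (belief in a label minus the largest plausibility of any other label) is a hypothetical stand-in for a belief/plausibility-based reject criterion, not necessarily the rule defined in the paper.

```python
from itertools import product


def yager_combine(m1, m2, frame):
    """Combine two mass functions with Yager's rule.

    Products of masses go to the intersection of their focal sets;
    mass from empty intersections (conflict) is moved to the whole
    frame instead of being normalized away.
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb
    combined[frame] = combined.get(frame, 0.0) + conflict
    return combined


def belief(m, a):
    """Total mass of focal sets contained in a."""
    return sum(w for s, w in m.items() if s <= a)


def plausibility(m, a):
    """Total mass of focal sets that intersect a."""
    return sum(w for s, w in m.items() if s & a)


def accept_gap(m, label, frame):
    """Hypothetical score: bel({label}) minus the best competing plausibility."""
    others = [plausibility(m, frozenset({y})) for y in frame if y != label]
    return belief(m, frozenset({label})) - max(others)


# Toy example (assumed values): two neighbors with candidate sets {a, b} and {b, c}.
frame = frozenset({"a", "b", "c"})
m1 = {frozenset({"a", "b"}): 0.6, frame: 0.4}
m2 = {frozenset({"b", "c"}): 0.5, frame: 0.5}
m = yager_combine(m1, m2, frame)

b = frozenset({"b"})
print(belief(m, b), plausibility(m, b), accept_gap(m, "b", frame))
```

In this toy case the label `b` is fully plausible but only weakly believed, so a rule that requires a positive gap would reject the prediction.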

Abstract

In real-world applications, one often encounters ambiguously labeled data, where different annotators assign conflicting class labels. Partial-label learning allows training classifiers in this weakly supervised setting, where state-of-the-art methods already show good predictive performance. However, even the best algorithms give incorrect predictions, which can have severe consequences when they impact actions or decisions. We propose a novel risk-consistent nearest-neighbor-based partial-label learning algorithm with a reject option, that is, the algorithm can reject unsure predictions. Extensive experiments on artificial and real-world datasets show that our method provides the best trade-off between the number and accuracy of non-rejected predictions when compared to our competitors, which use confidence thresholds for rejecting unsure predictions. When evaluated without the reject option, our nearest-neighbor-based approach also achieves competitive prediction performance.
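The trade-off mentioned above can be sketched for a single confidence threshold as follows. This is an illustrative helper under stated assumptions: the name `reject_tradeoff`, the accept-if-confidence-at-least-threshold rule, and the toy data are not taken from the paper.

```python
def reject_tradeoff(confidences, predictions, labels, threshold):
    """Return (fraction rejected, accuracy on accepted predictions).

    A prediction is accepted when its confidence is at least `threshold`
    (assumed rule). Accuracy is None when every prediction is rejected.
    """
    accepted = [
        (p, y)
        for c, p, y in zip(confidences, predictions, labels)
        if c >= threshold
    ]
    fraction_rejected = 1 - len(accepted) / len(labels)
    if not accepted:
        return fraction_rejected, None
    accuracy = sum(p == y for p, y in accepted) / len(accepted)
    return fraction_rejected, accuracy


# Toy data (assumed): four predictions with their confidences and true labels.
confidences = [0.95, 0.60, 0.92, 0.40]
predictions = [0, 1, 2, 1]
labels = [0, 2, 2, 1]

print(reject_tradeoff(confidences, predictions, labels, threshold=0.9))
print(reject_tradeoff(confidences, predictions, labels, threshold=0.0))
```

Sweeping the threshold over its range and plotting the resulting pairs gives a trade-off curve between the fraction of rejected predictions and the accuracy of the remaining ones.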

Paper Structure (30 sections, 8 theorems, 27 equations, 17 figures, 3 tables, 1 algorithm)

Introduction
Related Work
Partial-Label Learning
Reject Options
Problem Statement and Notations
Partial-Label Learning (PLL) with Reject Option
Dempster-Shafer Theory (DST)
Our Method: DST-PLL
Making Predictions
Proposed Reject Option
Consistency
Runtime Complexity
Experiments
Algorithms for Comparison
Experimental Setup
...and 15 more sections

Key Result

Theorem 4.2

Let $\mathcal{Y}$ be the label space, $\tilde{x} \in \mathcal{X}$ the instance of interest, $\tilde{s} \subseteq \mathcal{Y}$ its candidate labels ($\tilde{s} = \mathcal{Y}$ if $\tilde{x}$ is ...

Figures (17)

Figure 1: Trade-off between the fraction of rejected predictions and the accuracy of non-rejected predictions for three experiments: Ecoli with instance-dependent noise, KMNIST with instance-dependent noise, and the real-world dataset msrc-v2. We show the trade-off curves for varying confidence (0 to 1) and $\Delta_{\tilde{m}}$ (-1 to 1) thresholds. We highlight the points corresponding to a threshold of $\Delta_{\tilde{m}} = 0$ for our method, a confidence threshold of 90% for methods with a probability output, and a threshold of 50% of all votes for Pl-Knn. We refer to the appendix on trade-off curves for all reject trade-off curves across all experimental settings.
Figure 2: Comparison of simulating and calculating the expected belief of the true label $\tilde{y}$ and of the most frequently co-occurring label $\tilde{y}_{\operatorname{c}}$ for $k \in \left[100\right]$, $l = 3$, $p_1 = 0.4$, $p_2 = 0.35$, and $p_3 = 0.25$.
Figure 3: Counts of the class labels in the candidate sets $s_i$ of instance $\tilde{x}$'s 10-nearest neighbors $x_i$.
Figure 4: Sensitivity to $k$ of the test-set MCC score, the fraction of confident (non-rejected) predictions, and the MCC score of those predictions.
Figure 5: Trade-off between the fraction of rejected predictions and the accuracy of non-rejected predictions.
...and 12 more figures

Theorems & Definitions (12)

Example 4.1: Classification rule
Theorem 4.2
Example 4.3: Reject option
Lemma 4.5
Theorem 4.6
Lemma B.1
proof
Definition C.1
Theorem C.2
Theorem C.3
...and 2 more
