ESA: Example Sieve Approach for Multi-Positive and Unlabeled Learning
Zhongnian Li, Meng Wei, Peng Ying, Xinzheng Xu
TL;DR
The paper tackles learning from multi-positive and unlabeled (MPU) data, where flexible models can cause the minimum risk to shift. It introduces the Example Sieve Approach (ESA), which sieves training examples by their Certain Loss (CL) to form sieved datasets and yields a biased but consistent ESA risk estimator for multi-class classification. The authors establish consistency and a generalization bound via Rademacher complexity, showing that the estimation error attains the optimal parametric convergence rate. Empirically, ESA outperforms state-of-the-art MPU/PU methods on MNIST, Kuzushiji-MNIST, and CIFAR datasets and is robust to class-prior mis-specification and distribution shifts, which makes it relevant to weakly supervised multi-class learning in practice.
Abstract
Learning from Multi-Positive and Unlabeled (MPU) data has attracted growing attention in practical applications. Unfortunately, the MPU risk also suffers from a shift of the minimum risk, particularly when the models are very flexible, as shown in Fig.~\ref{moti}. In this paper, to alleviate this minimum-risk shift problem, we propose an Example Sieve Approach (ESA) that selects examples for training a multi-class classifier. Specifically, we sieve out some examples according to the Certain Loss (CL) value of each example during training, and we analyze the consistency of the proposed risk estimator. In addition, we show that the estimation error of the proposed ESA attains the optimal parametric convergence rate. Extensive experiments on various real-world datasets show that the proposed approach outperforms previous methods.
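The sieving step described above can be illustrated with a minimal sketch. The abstract does not give the definition of CL or the sieving rule, so everything below is an assumption made for illustration: per-example cross-entropy stands in for the Certain Loss, and the hypothetical bounds `lower` and `upper` stand in for the sieving criterion. The function names are likewise hypothetical and are not taken from the paper.

```python
import numpy as np


def per_example_cross_entropy(logits, labels):
    """Cross-entropy of each example; used here as a stand-in for CL (assumption)."""
    logits = np.asarray(logits, dtype=float)
    labels = np.asarray(labels, dtype=int)
    # Numerically stable log-softmax: subtract the row-wise maximum first.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels]


def sieve_by_loss(losses, lower=0.0, upper=np.inf):
    """Boolean mask of examples whose loss lies in [lower, upper].

    Both bounds are hypothetical; the paper's actual criterion may differ.
    """
    losses = np.asarray(losses, dtype=float)
    return (losses >= lower) & (losses <= upper)


# Toy batch: three examples, three classes.
logits = np.array([[2.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0],
                   [0.0, 3.0, 0.0]])
labels = np.array([0, 1, 0])

losses = per_example_cross_entropy(logits, labels)
mask = sieve_by_loss(losses, upper=2.0)  # the third example has loss above 2.0
sieved_logits, sieved_labels = logits[mask], labels[mask]
```

In a training loop, a mask of this kind would be recomputed at each step so that only the retained examples contribute to the empirical risk for that step.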
