Accessible, Realistic, and Fair Evaluation of Positive-Unlabeled Learning Algorithms

Wei Wang; Dong-Dong Wu; Ming Li; Jingxiong Zhang; Gang Niu; Masashi Sugiyama

TL;DR

We propose the first PU learning benchmark for systematically comparing PU learning algorithms. We also identify the internal label shift problem in the unlabeled training data of the one-sample setting, and propose a simple yet effective calibration approach to ensure fair comparisons within and across algorithm families.

Abstract

Positive-unlabeled (PU) learning is a weakly supervised binary classification problem, in which the goal is to learn a binary classifier from only positive and unlabeled data, without access to negative data. In recent years, many PU learning algorithms have been developed to improve model performance. However, experimental settings are highly inconsistent, making it difficult to identify which algorithm performs better. In this paper, we propose the first PU learning benchmark to systematically compare PU learning algorithms. During our implementation, we identify subtle yet critical factors that affect the realistic and fair evaluation of PU learning algorithms. On the one hand, many PU learning algorithms rely on a validation set that includes negative data for model selection. This is unrealistic in traditional PU learning settings, where no negative data are available. To handle this problem, we systematically investigate model selection criteria for PU learning. On the other hand, PU learning involves different problem settings and corresponding solution families, i.e., the one-sample and two-sample settings. However, existing evaluation protocols are heavily biased towards the one-sample setting and neglect the significant difference between them. We identify the internal label shift problem of unlabeled training data for the one-sample setting and propose a simple yet effective calibration approach to ensure fair comparisons within and across families. We hope our framework will provide an accessible, realistic, and fair environment for evaluating PU learning algorithms in the future.
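
To make the difference between the two settings concrete, the sketch below illustrates one common reading of the one-sample setting: a single sample is drawn from the population, and a fraction of its positives is labeled, leaving the rest as unlabeled data. Under that assumption, the unlabeled set contains proportionally fewer positives than the population, which is a class-prior shift between unlabeled training data and test data. The names `class_prior` and `label_freq` and the uniform labeling mechanism are illustrative assumptions rather than details taken from the abstract, and the sketch does not implement the calibration approach proposed in the paper.

```python
import random


def one_sample_unlabeled_prior(class_prior: float, label_freq: float) -> float:
    """Expected positive fraction among unlabeled data in the one-sample setting.

    Assumes each positive is labeled independently with probability
    `label_freq` (a hypothetical parameter), and negatives are never labeled.
    """
    remaining_pos = class_prior * (1.0 - label_freq)  # positives left unlabeled
    unlabeled_mass = 1.0 - class_prior * label_freq   # everything left unlabeled
    return remaining_pos / unlabeled_mass


def simulate_one_sample(n: int, class_prior: float, label_freq: float,
                        seed: int = 0) -> float:
    """Empirical positive fraction among unlabeled data from one simulated draw."""
    rng = random.Random(seed)
    unlabeled_pos = 0
    unlabeled_total = 0
    for _ in range(n):
        is_pos = rng.random() < class_prior
        is_labeled = is_pos and rng.random() < label_freq
        if not is_labeled:
            unlabeled_total += 1
            unlabeled_pos += int(is_pos)
    return unlabeled_pos / unlabeled_total


if __name__ == "__main__":
    pi, c = 0.4, 0.5
    print("population class prior:", pi)
    print("expected unlabeled prior (one-sample):", one_sample_unlabeled_prior(pi, c))
    print("simulated unlabeled prior:", simulate_one_sample(200_000, pi, c))
```

In the two-sample setting, by contrast, the unlabeled set is drawn independently from the population, so its positive fraction matches `class_prior`. The gap between the two is what an evaluation protocol shared across both families would need to account for.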
