PUAL: A Classifier on Trifurcate Positive-Unlabeled Data
Xiaoke Wang, Xiaochen Yang, Rui Zhu, Jing-Hao Xue
TL;DR
This work tackles positive-unlabeled (PU) learning on trifurcate data, where two positive subgroups lie on opposite sides of the negatives. It introduces PUAL, a classifier that adds an asymmetric loss structure to the objective of the global and local learning classifier (GLLC), and extends it to non-linear decision boundaries via kernelization and ADMM-based optimization. In experiments on synthetic trifurcate-like data and 16 real-world UCI datasets, PUAL recovers decision boundaries more faithfully and achieves higher F1 scores than GLLC and other PU approaches, with the kernelized version handling more complex separations. The results suggest PUAL's practical value for PU tasks with multi-modal positive distributions; future work may use L1 regularization to encourage sparsity and feature selection.
Abstract
Positive-unlabeled (PU) learning aims to train a classifier using data that contain only labeled-positive instances and unlabeled instances. However, existing PU learning methods generally struggle to achieve satisfactory performance on trifurcate data, where the positive instances are distributed on both sides of the negative instances. To address this issue, we first propose a PU classifier with asymmetric loss (PUAL), which introduces an asymmetric loss structure on positive instances into the objective function of the global and local learning classifier. We then develop a kernel-based algorithm that enables PUAL to obtain a non-linear decision boundary. Experiments on both simulated and real-world datasets show that PUAL can achieve satisfactory classification on trifurcate data.
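To make the notion of trifurcate PU data concrete, the following minimal sketch generates a toy two-dimensional dataset in which two positive clusters sit on either side of a negative cluster, and only a fraction of the positives carry a label. It is an illustration only, not the simulation protocol used in the paper: the function name `make_trifurcate_pu`, the cluster centers, the unit variance, and the 30% labeling fraction are all assumptions chosen for readability.

```python
import numpy as np


def make_trifurcate_pu(n_per_group=200, label_frac=0.3, seed=0):
    """Toy trifurcate PU data (hypothetical helper, illustrative settings).

    Returns features X, hidden true classes y_true (1 = positive, 0 = negative),
    and the observed PU indicator s (1 = labeled positive, 0 = unlabeled).
    """
    rng = np.random.default_rng(seed)

    # Two positive clusters placed on opposite sides of the negative cluster.
    pos_left = rng.normal(loc=[-4.0, 0.0], scale=1.0, size=(n_per_group, 2))
    neg = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(n_per_group, 2))
    pos_right = rng.normal(loc=[4.0, 0.0], scale=1.0, size=(n_per_group, 2))

    X = np.vstack([pos_left, neg, pos_right])
    y_true = np.concatenate(
        [np.ones(n_per_group), np.zeros(n_per_group), np.ones(n_per_group)]
    ).astype(int)

    # Only a random subset of the positives is labeled; everything else
    # (remaining positives and all negatives) is observed as unlabeled.
    pos_idx = np.flatnonzero(y_true == 1)
    n_labeled = int(label_frac * pos_idx.size)
    labeled_idx = rng.choice(pos_idx, size=n_labeled, replace=False)

    s = np.zeros(y_true.size, dtype=int)
    s[labeled_idx] = 1
    return X, y_true, s


X, y_true, s = make_trifurcate_pu()
print(X.shape, int(s.sum()), int((s == 0).sum()))
```

A PU learner sees only `X` and `s`; `y_true` is kept here solely so that a fitted classifier could be evaluated afterwards, for example with an F1 score.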
