Applications of Positive Unlabeled (PU) and Negative Unlabeled (NU) Learning in Cybersecurity
Robert Dilworth, Charan Gudla
TL;DR
The paper argues that Positive Unlabeled (PU) and Negative Unlabeled (NU) learning offer principled semi-supervised approaches for cybersecurity tasks characterized by scarce labels and abundant unlabeled data. It surveys applications across subfields (intrusion detection, vulnerability management, malware detection, threat intelligence, and more), provides formal problem formulations, and discusses practical challenges such as scalability, class imbalance, and evolving threats. Through literature synthesis, it identifies gaps in real-time deployment, domain-knowledge integration, and generalization across cybersecurity domains, and proposes directions such as meta-learning and density-ratio estimation to advance PU/NU methods. The work aims to catalyze the integration of PU/NU learning into cyber defense workflows to improve detection, response, and resilience against emerging threats.
Abstract
This paper examines the underexplored application of Positive Unlabeled (PU) learning and Negative Unlabeled (NU) learning in the cybersecurity domain. While these semi-supervised learning methods have been applied successfully in fields like medicine and marketing, their potential in cybersecurity remains largely untapped. We identify key areas of cybersecurity, such as intrusion detection, vulnerability management, malware detection, and threat intelligence, where PU/NU learning can offer significant improvements, particularly in scenarios with imbalanced or limited labeled data. We provide a detailed problem formulation for each subfield, supported by mathematical reasoning, and highlight the specific challenges and research gaps in scaling these methods to real-time systems, addressing class imbalance, and adapting to evolving threats. Finally, we propose future directions to advance the integration of PU/NU learning in cybersecurity, with the aim of better detecting, managing, and mitigating emerging cyber threats.
