Does Negative Sampling Matter? A Review with Insights into its Theory and Applications
Zhen Yang, Ming Ding, Tinglin Huang, Yukuo Cen, Junshuai Song, Bin Xu, Yuxiao Dong, Jie Tang
TL;DR
This survey synthesizes negative sampling (NS) as a unifying technique across domains, formalizing a general NS framework and tracing its development along five evolutionary paths. It categorizes methods by how negative candidates are constructed (global, local, mini-batch, hop, memory-based) and by sampling strategy (static, hard, GAN-based, auxiliary-based, in-batch), detailing their pros, cons, and domain-specific adaptations. The paper surveys NS applications in recommender systems, graph representation learning, knowledge graphs, NLP, and computer vision, highlighting practical gains, trade-offs, and robust design principles. It also discusses open problems, such as the trade-off between negative quantity and quality, false negatives, and the potential of non-sampling or negative-free approaches, and outlines future directions for adaptive, robust, and scalable NS methods.
Abstract
Negative sampling has swiftly risen to prominence as a focal point of research, with wide-ranging applications spanning machine learning, computer vision, natural language processing, data mining, and recommender systems. This growing interest raises several critical questions: Does negative sampling really matter? Is there a general framework that can incorporate all existing negative sampling methods? In what fields is it applied? Addressing these questions, we propose a general framework that leverages negative sampling. Delving into its history, we trace the development of negative sampling through five evolutionary paths. We dissect and categorize the strategies used to select negative sample candidates, detailing global, local, mini-batch, hop, and memory-based approaches. Our review categorizes current negative sampling methods into five types: static, hard, GAN-based, auxiliary-based, and in-batch methods, providing a clear structure for understanding negative sampling. Beyond detailed categorization, we highlight the application of negative sampling in various areas, offering insights into its practical benefits. Finally, we briefly discuss open problems and future directions for negative sampling.
