Convergence Behavior of an Adversarial Weak Supervision Method
Steven An, Sanjoy Dasgupta
TL;DR
This paper investigates the convergence properties of the Balsubramani-Freund (BF) adversarial weak supervision framework, which poses a worst-case minimax game under log-loss over a coherence polytope $P$ defined by bounds on rule accuracies and class frequencies. The BF learner's prediction $g^{bf}$ is shown to be the maximum-entropy distribution in $P$ and to lie in an exponential-family class $\mathcal{G}$, so that it takes the form of a regularized multiclass logistic regression. An uncertainty decomposition shows that BF's consistency comes from its ability to drive the approximation error to zero as the bounds tighten, and it yields the convergence rate $d(\eta, g^{bf}) \le d(\eta, g^{*}) + O(\|\epsilon\|_\infty)$. The paper also compares BF to the Dawid-Skene (DS) probabilistic approach, showing that DS can be inconsistent in EM-driven settings, while BF can dominate under appropriate bound tightening. Experiments on ten real datasets corroborate the theory: BF achieves competitive or superior log-loss, and synthetic setups illustrate consistency through convergence to the true label distribution. Overall, the work provides a rigorous link between adversarial weak supervision and logistic-regression-like inference, with practical implications for reliable label aggregation when labeling functions are imperfect or abstain.
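The link between the maximum-entropy solution and regularized logistic regression can be illustrated with a small sketch. It is not the paper's implementation: it assumes binary labels, rules that never abstain, accuracy bounds only (class-frequency bounds are left out), and a plain proximal-gradient solver. The toy votes, the estimates `b`, the half-widths `eps`, and all function names are made up for illustration. Under these assumptions, maximizing entropy subject to $|\mathrm{acc}_j(g) - b_j| \le \epsilon_j$ has a dual that is an L1-penalized logistic-style objective in the rule weights `theta`.

```python
import math

# Toy data (all values are made up for illustration).
# votes[j][i] is the label in {0, 1} that rule j assigns to point i.
votes = [
    [0, 0, 0, 1, 1, 1, 1, 0],
    [0, 0, 1, 0, 1, 1, 0, 1],
    [0, 1, 0, 0, 1, 0, 1, 1],
]
b = [0.75, 0.75, 0.75]    # assumed accuracy estimate for each rule
eps = [0.05, 0.05, 0.05]  # assumed half-width of each accuracy bound
n, p, k = 8, 3, 2         # points, rules, classes


def predict(theta):
    """Softmax over labels of the theta-weighted votes, one row per point."""
    rows = []
    for i in range(n):
        scores = [sum(theta[j] for j in range(p) if votes[j][i] == l)
                  for l in range(k)]
        m = max(scores)  # subtract the max for numerical stability
        w = [math.exp(s - m) for s in scores]
        z = sum(w)
        rows.append([x / z for x in w])
    return rows


def expected_accuracy(g):
    """Accuracy of each rule if the labels were distributed according to g."""
    return [sum(g[i][votes[j][i]] for i in range(n)) / n for j in range(p)]


def soft_threshold(x, t):
    """Proximal operator of t * |x|."""
    return math.copysign(max(abs(x) - t, 0.0), x)


def fit(steps=2000, lr=1.0):
    """Proximal gradient on the dual:
    mean log-partition - theta . b + sum_j eps_j * |theta_j|."""
    theta = [0.0] * p
    for _ in range(steps):
        acc = expected_accuracy(predict(theta))
        # Gradient of the smooth part is (expected accuracy - b).
        theta = [soft_threshold(theta[j] - lr * (acc[j] - b[j]), lr * eps[j])
                 for j in range(p)]
    return theta


theta = fit()
g = predict(theta)            # exponential-family (softmax) prediction
acc = expected_accuracy(g)    # should fall inside [b - eps, b + eps]
```

In this sketch the accuracy bounds act as the regularizer: a wider `eps[j]` shrinks `theta[j]` toward zero, and the resulting prediction is pulled toward the uniform distribution.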
Abstract
Labeling data via rules-of-thumb and minimal label supervision is central to Weak Supervision, a paradigm subsuming subareas of machine learning such as crowdsourced learning and semi-supervised ensemble learning. By using this labeled data to train modern machine learning methods, the cost of acquiring large amounts of hand-labeled data can be reduced. Approaches to combining the rules-of-thumb fall into two camps, reflecting different ideologies of statistical estimation. The most common approach, exemplified by the Dawid-Skene model, is based on probabilistic modeling. The other, developed in the work of Balsubramani-Freund and others, is adversarial and game-theoretic. We provide a variety of statistical results for the adversarial approach under log-loss: we characterize the form of the solution, relate it to logistic regression, demonstrate consistency, and give rates of convergence. On the other hand, we find that probabilistic approaches for the same model class can fail to be consistent. Experimental results are provided to corroborate the theoretical results.
