Optimal Learning from Label Proportions with General Loss Functions
Lorne Applebaum, Travis Dick, Claudio Gentile, Haim Kaplan, Tomer Koren
TL;DR
The paper tackles Learning from Label Proportions (LLP), in which training labels are available only as bag-level proportions. It introduces a low-variance debiasing method that builds unbiased bag-level estimators for general loss functions in both binary and multiclass tasks, with a variance bound independent of the bag size $k$. It then develops a Median-of-Means tournament that selects hypotheses using pairwise loss differences, achieving regret bounds whose sample complexity depends on $k$, the class count $c$, and the number of bags $m$. Empirical results on MNIST, CIFAR-10, Higgs, Adult, and Criteo demonstrate strong performance, especially for large bag sizes, and establish competitive baselines in both batch and online settings. The framework broadens the applicability of LLP to practical losses and to large-scale real-world datasets such as online advertising conversion prediction.
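As a rough illustration of two ingredients named in this summary, the sketch below shows (i) how individual labels collapse into bag-level proportions and (ii) a plain median-of-means estimate of a mean. It is not the paper's debiased estimator or its tournament procedure; the function names `make_bags` and `median_of_means`, the consecutive bagging scheme, and the block count are assumptions made for this example only.

```python
import numpy as np

def make_bags(labels, k):
    """Group labels into consecutive bags of size k and return each bag's label proportion.

    Assumption for this sketch: bags are formed from consecutive examples,
    and leftover examples that do not fill a bag are dropped.
    """
    labels = np.asarray(labels, dtype=float)
    m = len(labels) // k  # number of complete bags
    return labels[: m * k].reshape(m, k).mean(axis=1)

def median_of_means(values, n_blocks):
    """Split values into n_blocks groups, average each group, and return the median of the averages."""
    values = np.asarray(values, dtype=float)
    blocks = np.array_split(values, n_blocks)
    return float(np.median([block.mean() for block in blocks]))

# Binary labels observed only through bag proportions (bag size k = 2).
proportions = make_bags([1, 0, 1, 1, 0, 0, 1], k=2)

# A single outlier dominates the plain mean but not the median of block means.
samples = [1, 1, 1, 1, 1, 1, 1, 1, 1000]
robust_estimate = median_of_means(samples, n_blocks=3)
plain_mean = float(np.mean(samples))
```

Here the learner would see only `proportions`, never the individual labels. The median-of-means estimate stays near the bulk of the data because the outlier can corrupt at most one block average, which is the general reason this kind of estimator is attractive when comparing noisy loss estimates.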
Abstract
Motivated by problems in online advertising, we address the task of Learning from Label Proportions (LLP). We introduce a novel and versatile low-variance debiasing methodology to learn from aggregate label information, significantly advancing the state of the art in LLP. Our debiasing approach exhibits remarkable flexibility, seamlessly accommodating a broad spectrum of practically relevant loss functions across both binary and multi-class classification settings. By carefully combining our estimators with standard techniques, we improve sample complexity guarantees for a large class of losses of practical relevance. We also empirically validate the efficacy of our proposed approach across a diverse array of benchmark datasets, demonstrating compelling empirical advantages over standard baselines.
