On Learning Latent Models with Multi-Instance Weak Supervision
Kaifu Wang, Efthymia Tsamoura, Dan Roth
TL;DR
This work formulates learning latent models under multi-instance weak supervision (multi-instance PLL), where a deterministic transition $\sigma$ maps the hidden labels of $M$ instances to a weak label $s$. It provides a theoretical foundation by introducing the necessary and sufficient $M$-unambiguity condition for learnability under known, and later unknown, transitions, and derives Rademacher-style error bounds using a top-$k$ semantic loss. The paper extends the theory from a single classifier to multiple classifiers and analyzes learning when $\sigma$ is unknown via an unambiguous transition space $\mathcal{G}$, with bounded-risk assumptions. Empirical results on neuro-symbolic-style MNIST tasks validate the theory and highlight scalability challenges inherent to weak supervision with complex logical constraints. Overall, the work advances understanding of how multi-instance supervision and logical reasoning can be integrated with learning, while offering concrete generalization guarantees and practical insights for scalable neuro-symbolic systems.
Abstract
We consider a weakly supervised learning scenario where the supervision signal is generated by a transition function $\sigma$ of labels associated with multiple input instances. We formulate this problem as \emph{multi-instance Partial Label Learning (multi-instance PLL)}, which is an extension of the standard PLL problem. This problem arises in several fields, including latent structural learning and neuro-symbolic integration. Despite the existence of many learning techniques, limited theoretical analysis has been dedicated to this problem. In this paper, we provide the first theoretical study of multi-instance PLL with a possibly unknown transition $\sigma$. Our main contributions are as follows. First, we propose a necessary and sufficient condition for the learnability of the problem. This condition non-trivially generalizes and relaxes the existing small ambiguity degree in the PLL literature, since we allow the transition to be deterministic. Second, we derive Rademacher-style error bounds based on a top-$k$ surrogate loss that is widely used in the neuro-symbolic literature. Finally, we present empirical experiments for learning under unknown transitions. The empirical results align with our theoretical findings; however, they also expose the issue of scalability in the weak supervision literature.
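To make the setting concrete, the following is a minimal sketch of multi-instance weak supervision, assuming a sum-of-digits transition in the style of the MNIST tasks mentioned above. The function names (`sigma_sum`, `consistent_label_vectors`, `top_k_loss`) are hypothetical, and the loss shown is one common form of a top-$k$ surrogate (negative log of the probability mass of the $k$ most likely label vectors consistent with the weak label); the paper's exact definition may differ.

```python
import itertools
import math


def sigma_sum(labels):
    """Illustrative deterministic transition: the weak label is the sum of the hidden labels."""
    return sum(labels)


def consistent_label_vectors(sigma, s, num_classes, M):
    """Enumerate all hidden label vectors of length M that sigma maps to the weak label s."""
    return [
        y
        for y in itertools.product(range(num_classes), repeat=M)
        if sigma(y) == s
    ]


def top_k_loss(probs, sigma, s, k=1):
    """Negative log of the total probability of the k most likely consistent label vectors.

    probs[i][c] is the classifier's predicted probability that instance i has label c.
    Instances are assumed to be scored independently, so a label vector's probability
    is the product of its per-instance probabilities.
    """
    M = len(probs)
    num_classes = len(probs[0])
    candidates = consistent_label_vectors(sigma, s, num_classes, M)
    scores = sorted(
        (math.prod(probs[i][y[i]] for i in range(M)) for y in candidates),
        reverse=True,
    )
    mass = sum(scores[:k])
    # Clamp to avoid log(0) when no consistent vector has positive probability.
    return -math.log(max(mass, 1e-12))
```

The brute-force enumeration grows as the number of classes raised to the power $M$, which gives a small-scale picture of the scalability issue the abstract refers to.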
