Minimizing Human Intervention in Online Classification
William Réveillard, Vasileios Saketos, Alexandre Proutiere, Richard Combes
TL;DR
The paper studies online classification with costly human input, formalizing a setting in which query embeddings are drawn i.i.d. and, for each query, the agent either consults a human expert or guesses the label without receiving feedback. It introduces three geometry-driven algorithms (CHC, CC, GHC) that balance expert cost against predictive accuracy by exploiting convex or spherical class regions in embedding space. CHC maintains convex hulls of expert-labeled queries and calls the expert whenever a query falls outside every known hull, which avoids mistakes; CC estimates class centers and targets short horizons; GHC extends CHC with a tunable threshold that permits more aggressive guessing. On the theory side, CHC attains $\mathcal{O}(\log^d T)$ regret and is minimax optimal for $d=1$, while CC achieves regret proportional to $N\log N$ (with $N$ the number of labels) for subgaussian mixtures when $T \le e^d$; GHC is proposed as a practical bridge between these two regimes. Experiments on synthetic data and real-world QA datasets with LLM embeddings show gains from aggressive guessing via GHC and robustness to embedding dimensionality and density patterns, highlighting the approach’s potential to reduce costly human labeling in large-scale systems.
Abstract
We introduce and study an online problem arising in question answering systems. In this problem, an agent must sequentially classify user-submitted queries represented by $d$-dimensional embeddings drawn i.i.d. from an unknown distribution. The agent may consult a costly human expert for the correct label, or guess on her own without receiving feedback. The goal is to minimize regret against an oracle with free expert access. When the time horizon $T$ is at least exponential in the embedding dimension $d$, one can learn the geometry of the class regions: in this regime, we propose the Conservative Hull-based Classifier (CHC), which maintains convex hulls of expert-labeled queries and calls the expert as soon as a query lands outside all known hulls. CHC attains $\mathcal{O}(\log^d T)$ regret in $T$ and is minimax optimal for $d=1$. Otherwise, the geometry cannot be reliably learned without additional distributional assumptions. We show that when the queries are drawn from a subgaussian mixture, for $T \le e^d$, a Center-based Classifier (CC) achieves regret proportional to $N\log{N}$ where $N$ is the number of labels. To bridge these regimes, we introduce the Generalized Hull-based Classifier (GHC), a practical extension of CHC that allows for more aggressive guessing via a tunable threshold parameter. Our approach is validated with experiments, notably on real-world question-answering datasets using embeddings derived from state-of-the-art large language models.
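
To make the CHC rule described above concrete, the following is a minimal sketch for the special case $d=1$, where the convex hull of a label's expert-labeled queries is simply an interval. It is an illustration under stated assumptions, not the authors' implementation: the names `IntervalCHC`, `ask_expert`, and `simulate` are hypothetical, the expert is assumed to always return the correct label, and each class region is assumed to be a single interval disjoint from the others.

```python
import random


class IntervalCHC:
    """1-D sketch of a conservative hull-based classifier.

    Assumption (not from the paper's code): each class region is one interval,
    so the hull of a label's expert-labeled points is [min, max].
    """

    def __init__(self):
        self.hulls = {}        # label -> (lo, hi) of expert-labeled queries
        self.expert_calls = 0  # number of costly expert consultations

    def classify(self, x, ask_expert):
        # Guess only when the query lies inside a known hull.
        for label, (lo, hi) in self.hulls.items():
            if lo <= x <= hi:
                return label
        # Otherwise call the expert and grow that label's hull.
        label = ask_expert(x)
        self.expert_calls += 1
        lo, hi = self.hulls.get(label, (x, x))
        self.hulls[label] = (min(lo, x), max(hi, x))
        return label


def simulate(horizon=2000, seed=0):
    """Toy run: uniform queries on [0, 1], two classes split at 0.5 (hypothetical setup)."""
    rng = random.Random(seed)

    def true_label(x):
        return 0 if x < 0.5 else 1

    agent = IntervalCHC()
    mistakes = 0
    for _ in range(horizon):
        x = rng.random()
        if agent.classify(x, true_label) != true_label(x):
            mistakes += 1
    return mistakes, agent.expert_calls


if __name__ == "__main__":
    mistakes, calls = simulate()
    print(f"mistakes={mistakes}, expert_calls={calls}")
```

Under these assumptions the agent never guesses incorrectly, because each hull stays inside its true class interval, and the expert is called only when a query extends a hull. In higher dimensions the interval check would have to be replaced by a point-in-convex-hull test, for example one built on `scipy.spatial.Delaunay.find_simplex`.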
