Minimizing Human Intervention in Online Classification

William Réveillard, Vasileios Saketos, Alexandre Proutiere, Richard Combes

TL;DR

The paper studies online classification with costly human input: query embeddings are drawn i.i.d. from an unknown distribution, and for each query the agent either consults a human expert or guesses the label without receiving feedback. It introduces three geometry-driven algorithms (CHC, CC, GHC) that trade expert cost against prediction errors. CHC maintains convex hulls of expert-labeled queries and guesses only inside them, so as to avoid mistakes; CC estimates class centers and targets short horizons; GHC adds a tunable threshold that permits more aggressive guessing. On the theory side, CHC attains $\mathcal{O}(\log^d T)$ regret and is minimax optimal for $d=1$, while CC achieves regret proportional to $N\log N$ (with $N$ the number of labels) for subgaussian mixtures when $T \le e^d$; GHC is proposed as a practical bridge between the two regimes. Experiments on synthetic data and on real-world QA datasets with LLM embeddings demonstrate practical gains from aggressive guessing via GHC and show robustness to embedding dimensionality and density patterns, highlighting the approach's potential to reduce costly human labeling in large-scale systems.

Abstract

We introduce and study an online problem arising in question answering systems. In this problem, an agent must sequentially classify user-submitted queries represented by $d$-dimensional embeddings drawn i.i.d. from an unknown distribution. The agent may consult a costly human expert for the correct label, or guess on her own without receiving feedback. The goal is to minimize regret against an oracle with free expert access. When the time horizon $T$ is at least exponential in the embedding dimension $d$, one can learn the geometry of the class regions: in this regime, we propose the Conservative Hull-based Classifier (CHC), which maintains convex hulls of expert-labeled queries and calls the expert as soon as a query lands outside all known hulls. CHC attains $\mathcal{O}(\log^d T)$ regret in $T$ and is minimax optimal for $d=1$. Otherwise, the geometry cannot be reliably learned without additional distributional assumptions. We show that when the queries are drawn from a subgaussian mixture, for $T \le e^d$, a Center-based Classifier (CC) achieves regret proportional to $N\log{N}$ where $N$ is the number of labels. To bridge these regimes, we introduce the Generalized Hull-based Classifier (GHC), a practical extension of CHC that allows for more aggressive guessing via a tunable threshold parameter. Our approach is validated with experiments, notably on real-world question-answering datasets using embeddings derived from state-of-the-art large language models.
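The CHC rule described in the abstract (guess when a query falls inside a known hull, otherwise call the expert and grow that label's hull) can be illustrated with a minimal sketch in dimension one, where a convex hull is simply an interval. The class name `IntervalCHC`, the `toy_expert` labelling with cut points 0.3 and 0.7, and the uniform query distribution are all assumptions made for illustration; they are not taken from the paper, whose algorithm operates on $d$-dimensional hulls.

```python
import random
from typing import Callable, Dict, Tuple


class IntervalCHC:
    """Illustrative 1-D version of a conservative hull-based classifier.

    In one dimension the convex hull of a label's expert-labeled queries
    is the interval [min, max] of those queries.
    """

    def __init__(self, expert: Callable[[float], int]) -> None:
        self.expert = expert
        self.hulls: Dict[int, Tuple[float, float]] = {}  # label -> (lo, hi)
        self.expert_calls = 0

    def classify(self, x: float) -> int:
        # Guess without feedback if x lies inside a known hull.
        for label, (lo, hi) in self.hulls.items():
            if lo <= x <= hi:
                return label
        # Otherwise pay for the expert and extend that label's hull.
        label = self.expert(x)
        self.expert_calls += 1
        lo, hi = self.hulls.get(label, (x, x))
        self.hulls[label] = (min(lo, x), max(hi, x))
        return label


def toy_expert(x: float) -> int:
    """Hypothetical ground truth: three interval-shaped classes on [0, 1]."""
    if x < 0.3:
        return 0
    if x < 0.7:
        return 1
    return 2


def run_demo(seed: int = 0, horizon: int = 2000) -> Tuple[int, int]:
    """Run the sketch on uniform queries; return (mistakes, expert_calls)."""
    rng = random.Random(seed)
    clf = IntervalCHC(toy_expert)
    mistakes = 0
    for _ in range(horizon):
        x = rng.random()
        if clf.classify(x) != toy_expert(x):
            mistakes += 1
    return mistakes, clf.expert_calls
```

Calling `run_demo()` returns the number of wrong guesses and the number of expert calls. Because each toy class is an interval, every hull stays inside its own class region, so the sketch never guesses wrongly; all of its cost comes from expert calls, whose count depends on the seed and horizon.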

Paper Structure

This paper contains 58 sections, 24 theorems, 142 equations, 8 figures, 9 tables, 2 algorithms.

Key Result

Theorem 4.1

Under Assumptions as:cells(i) and as:density(i): (a) if $\mathcal{E}=\mathcal{I}^d$ and $d \ge 2$, then the regret of CHC satisfies (b) if $\mathcal{E}=\mathcal{S}^{d-1}$, $d \ge 3$ and each cell $\mathcal{C}_i$ is contained in an open halfsphere $\mathcal{S}_{e_i}^{+}$, then the regret of CHC satisfies where $K= \max_{i \in [N]}\left(\frac{\max_{y \in \mathcal{C}_i} y^{\top}e_i}{\min_{y \in \ma

Figures (8)

  • Figure 1: Voronoi tessellation of $\mathcal{S}^2$. In blue, gnomonic projection of a cell onto the tangent plane at its seed (used in Section \ref{sec:regretVHC}).
  • Figure 2: Hulls $\hat{\cal C}_{i,t}$ of CHC at $t=200$. $\mu$ is a mixture of truncated Gaussian distributions with equal weights and covariance matrix $0.01I$. Stars are the seeds, circles are the queries that required an expert call.
  • Figure 3: Decision regions of GHC($\tau$) for a mixture of truncated Gaussian distributions, covariance matrix $0.0025I$.
  • Figure 4: Comparison of GHC using different LLMs on Quora Question Groups dataset.
  • Figure 5: Lower bound construction in dimension one, $\phi_{ij}=(s_i+s_j)/2$
  • ...and 3 more figures

Theorems & Definitions (49)

  • Theorem 4.1
  • Corollary 4.2
  • Theorem 4.3
  • Theorem 4.4: Lower bound on the minimax regret
  • Theorem 4.5
  • Remark B.1
  • Proposition B.2: Theorem 2 from barany1993
  • Corollary B.3
  • proof: Proof of Theorem \ref{thm:d-dimensional-regret}(a)
  • proof: Proof of Corollary \ref{cor:thinned_polytope_volume}
  • ...and 39 more