Budgeted Multiple-Expert Deferral

Giulia DeSalvo, Clara Mohri, Mehryar Mohri, Yutao Zhong

TL;DR

The paper tackles training-time costs in learning-to-defer with multiple experts by introducing a budgeted deferral framework that selectively queries expert costs. It develops a two-stage IWAL-inspired algorithm with a Sampling-Probs subroutine that prunes a version space and assigns query probabilities based on hypothesis disagreement, enabling strong generalization guarantees and favorable label complexity. Theoretical results show a square-root-type (sublinear) growth in label complexity in realizable settings and favorable dependence on disagreement metrics, while practical convex-optimization strategies support scalable implementations. Empirical evaluation across ten datasets demonstrates that the budgeted approach closely matches full-query baselines in accuracy while substantially reducing the number of queried expert costs, highlighting its practical value for resource-constrained deployments, including large language models and human annotators.
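To make the mechanism described above more concrete, here is a toy IWAL-style loop in Python. It is a minimal sketch under stated assumptions, not the paper's Sampling-Probs subroutine: the finite class of threshold routers, the two-expert setup with 0/1 costs, the `cost_oracle` callable, and the pruning slack `sqrt(t * log(t + 1))` are all hypothetical choices made for illustration. With 0/1 costs the query probability is either 0 or 1 in this toy; a surrogate loss would generally yield fractional probabilities.

```python
import math
import random


def make_routers(thresholds):
    # Hypothetical hypothesis class: router with threshold t defers x to
    # expert 0 if x < t, and to expert 1 otherwise.
    return [lambda x, t=t: 0 if x < t else 1 for t in thresholds]


def query_probability(version_space, x):
    # Largest possible loss gap between two surviving routers on x.
    # With 0/1 costs the gap is 1 if they pick different experts, else 0.
    choices = {r(x) for r in version_space}
    return 1.0 if len(choices) > 1 else 0.0


def budgeted_training(stream, routers, slack_scale=1.0, seed=0):
    rng = random.Random(seed)
    alive = list(range(len(routers)))        # indices in the version space
    weighted_loss = [0.0] * len(routers)     # importance-weighted cumulative loss
    n_queries = 0
    for t, (x, cost_oracle) in enumerate(stream, start=1):
        p = query_probability([routers[i] for i in alive], x)
        if p > 0 and rng.random() < p:
            costs = cost_oracle()            # expensive step: reveals expert costs
            n_queries += 1
            for i in alive:
                weighted_loss[i] += costs[routers[i](x)] / p
        # Prune routers whose weighted loss exceeds the best by more than a
        # slack term (illustrative choice, not the paper's threshold).
        best = min(weighted_loss[i] for i in alive)
        slack = slack_scale * math.sqrt(t * math.log(t + 1))
        alive = [i for i in alive if weighted_loss[i] <= best + slack]
    best_index = min(alive, key=lambda i: weighted_loss[i])
    return best_index, alive, n_queries


def run_demo(n_rounds=3000, seed=0):
    # Synthetic realizable data: expert 0 is correct for x < 0.5, expert 1 otherwise.
    rng = random.Random(seed)
    thresholds = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
    routers = make_routers(thresholds)
    stream = []
    for _ in range(n_rounds):
        x = rng.random()
        costs = (0, 1) if x < 0.5 else (1, 0)
        stream.append((x, lambda costs=costs: costs))
    best_index, alive, n_queries = budgeted_training(stream, routers, seed=seed)
    return thresholds[best_index], n_queries, n_rounds
```

Calling `run_demo()` returns the selected threshold, the number of rounds in which costs were queried, and the total number of rounds. In this toy, queries stop once the surviving routers agree on every input, which is where the savings come from.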

Abstract

Learning to defer uncertain predictions to costly experts offers a powerful strategy for improving the accuracy and efficiency of machine learning systems. However, standard training procedures for deferral algorithms typically require querying all experts for every training instance, an approach that becomes prohibitively expensive when expert queries incur significant computational or resource costs. This undermines the core goal of deferral: to limit unnecessary expert usage. To overcome this challenge, we introduce the budgeted deferral framework, which aims to train effective deferral algorithms while minimizing expert query costs during training. We propose new algorithms for both two-stage and single-stage multiple-expert deferral settings that selectively query only a subset of experts per training example. While inspired by active learning, our setting is fundamentally different: labels are already known, and the core challenge is to decide which experts to query in order to balance cost and predictive performance. We establish theoretical guarantees for both of our algorithms, including generalization bounds and label complexity analyses. Empirical results across several domains show that our algorithms substantially reduce training costs without sacrificing prediction accuracy, demonstrating the practical value of our budget-aware deferral algorithms.

Paper Structure

This paper contains 55 sections, 21 theorems, 96 equations, 2 figures, 1 table, 4 algorithms.

Key Result

theorem 1

Let ${\mathscr D}$ be any distribution over ${\mathscr X} \times {\mathscr Y} \times \{0, 1\}^{n_e}$, and let ${\mathscr R}$ be a hypothesis class. Assume that $r^* \in {\mathscr R}$ minimizes the expected surrogate loss ${\mathscr E}(r)$. Then, for any $\delta > 0$, with probability at least … In particular, the learned hypothesis $r_T$ at time $T$ satisfies …

Figures (2)

  • Figure 1: Standard vs. Budgeted Two-Stage Multiple-Expert Deferral on Binary Datasets.
  • Figure 2: Standard vs. Budgeted Two-Stage Multiple-Expert Deferral on Multi-Class Datasets.

Theorems & Definitions (46)

  • theorem 1: Two-Stage Generalization Bound
  • definition 1: Slope Asymmetry for Two-Stage Deferral
  • definition 2: Hypothesis Distance Metric
  • lemma 1
  • definition 3
  • theorem 2: Two-Stage Label Complexity Bound
  • lemma 2: Lipschitz upper bound
  • proof
  • lemma 3: Coercivity on the zero-mean subspace
  • proof
  • ...and 36 more