Learning to Defer to a Population: A Meta-Learning Approach

Dharmesh Tailor; Aditya Patra; Rajeev Verma; Putra Manggala; Eric Nalisnick

TL;DR

This work addresses a limitation of traditional learning to defer (L2D), which assumes a fixed set of experts, by introducing Learning to Defer to a Population (L2D-Pop), which enables safe delegation to never-before-seen experts drawn from a known population. It proposes two meta-learning approaches (optimization-based fine-tuning and model-based neural processes with attention) that adapt a deferral policy using a compact context set of expert demonstrations, along with consistent surrogate losses for population-based deferral. Theoretical contributions include a generative model for experts, Bayes-optimal deferral rules, and surrogate losses (softmax and one-vs-all, OvA) that remain consistent under the population assumptions. Empirically, L2D-Pop improves system accuracy over single-expert baselines, and the neural-process variant and cross-attention deliver the strongest gains as expert diversity increases; results are reported on CIFAR-10, Traffic Signs, and HAM10000. The approach lets human-in-the-loop systems adapt to changing expert pools without retraining, with potential impact in medicine and other safety-critical domains.

Abstract

The learning to defer (L2D) framework allows autonomous systems to be safe and robust by allocating difficult decisions to a human expert. All existing work on L2D assumes that each expert is well-identified, and if any expert were to change, the system should be re-trained. In this work, we alleviate this constraint, formulating an L2D system that can cope with never-before-seen experts at test-time. We accomplish this by using meta-learning, considering both optimization- and model-based variants. Given a small context set to characterize the currently available expert, our framework can quickly adapt its deferral policy. For the model-based approach, we employ an attention mechanism that is able to look for points in the context set that are similar to a given test point, leading to an even more precise assessment of the expert's abilities. In the experiments, we validate our methods on image recognition, traffic sign detection, and skin lesion diagnosis benchmarks.
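The idea of attending to context points that resemble the test point can be illustrated with a small sketch. This is not the paper's attentive neural process: it uses plain scaled dot-product attention on raw features with no learned parameters, and the function names (`attentive_expert_accuracy`, `should_defer`), the toy data, and the deferral rule (defer when the estimated expert accuracy exceeds the classifier's top probability) are all assumptions made for illustration.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()


def attentive_expert_accuracy(query, ctx_feats, ctx_labels, ctx_expert_preds):
    """Attention-weighted estimate of the expert's accuracy near `query`.

    Hypothetical helper: context points whose features are similar to the
    query get larger weights, so the estimate reflects how the expert
    behaved on similar inputs.
    """
    # Scaled dot-product similarity between the query and each context point.
    scores = ctx_feats @ query / np.sqrt(query.shape[0])
    weights = softmax(scores)
    # 1.0 where the expert's prediction matched the true label, else 0.0.
    correct = (ctx_labels == ctx_expert_preds).astype(float)
    return float(weights @ correct)


def should_defer(classifier_probs, expert_acc):
    """Assumed rule: defer if the expert looks more reliable than the classifier."""
    return expert_acc > float(np.max(classifier_probs))


# Toy context set: the expert is right on cluster A and wrong on cluster B.
ctx_feats = np.array([[5.0, 0.0], [5.0, 0.5], [0.0, 5.0], [0.5, 5.0]])
ctx_labels = np.array([0, 0, 1, 1])
ctx_expert_preds = np.array([0, 0, 0, 0])

acc_near_a = attentive_expert_accuracy(
    np.array([5.0, 0.0]), ctx_feats, ctx_labels, ctx_expert_preds
)
acc_near_b = attentive_expert_accuracy(
    np.array([0.0, 5.0]), ctx_feats, ctx_labels, ctx_expert_preds
)

classifier_probs = np.array([0.6, 0.4])
defer_near_a = should_defer(classifier_probs, acc_near_a)
defer_near_b = should_defer(classifier_probs, acc_near_b)
```

In this toy setup the same expert is deferred to for queries near cluster A but not for queries near cluster B, which is the kind of per-input adaptation the attention mechanism is meant to provide. A learned model would replace the raw dot product with trained query, key, and value projections.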

Paper Structure (50 sections, 22 equations, 11 figures, 2 tables, 1 algorithm)

INTRODUCTION
BACKGROUND
Single-Expert Setting
Data & Models
Learning
Softmax Surrogate
Multi-Expert Setting
Data & Model
Learning
Softmax Surrogate Loss
Meta-Learning
Meta-Learning via Optimization
Neural Processes
LEARNING TO DEFER TO A POPULATION
Theoretical Formulation
...and 35 more sections

Figures (11)

Figure 1: Attentive Encoder of Expert's Context Set. The above diagram shows how an expert's context set is summarized into a representation. The cross-attention mechanism allows points in the context set to be emphasized if they are similar to the current query point. In the example above, images of cars would be emphasized to determine if this expert performs well at classifying cars.
Figure 2: Synthetic 2D Data. We simulate three clusters, two having class purity and a third having a mixture of two classes. Furthermore, we simulate three experts and show the model's decision regions for the worst ($1\%$) and best ($95\%$) experts. The dashed line is where single-expert L2D defers; it is constant across experts. The red region is where L2D-Pop defers; it successfully adapts to the expert by never deferring in the former case and deferring the whole of the difficult cluster in the latter case.
Figure 3: Varying Population Diversity on Image Classification Tasks. L2D-Pop exploits experts' context sets to make better deferral decisions, as shown by the increase in expert accuracy on deferred examples (bottom). This leads to a boost in overall system accuracy (top). The gap widens as the overlap in experts' abilities decreases.
Figure 4: L2D-Pop implemented with an attentive neural process (black) boosts performance when experts' abilities are specified by side-information (fine-grained labels) not provided in the context set.
Figure 5: Varying Population Diversity on Image Classification Tasks with OvA surrogate loss.
...and 6 more figures
