Learning to Defer to a Population: A Meta-Learning Approach
Dharmesh Tailor, Aditya Patra, Rajeev Verma, Putra Manggala, Eric Nalisnick
TL;DR
Standard learning to defer (L2D) assumes a fixed set of experts. This work introduces Learning to Defer to a Population (L2D-Pop), which allows deferral to never-before-seen experts drawn from a population of experts. It proposes two meta-learning approaches, optimization-based fine-tuning and a model-based neural process with attention, that adapt the deferral policy using a small context set of expert demonstrations. Theoretical contributions include a generative model for experts, the Bayes-optimal deferral rules, and surrogate losses (Softmax and OvA) that remain consistent in the population setting. Empirically, L2D-Pop improves system accuracy over single-expert baselines on CIFAR-10, Traffic Signs, and HAM10000, with the neural-process variant with cross-attention delivering the strongest gains as expert diversity increases. The approach aims to support human-in-the-loop systems that adapt to changing expert pools without retraining, with potential applications in medicine and other safety-critical domains.
Abstract
The learning to defer (L2D) framework allows autonomous systems to be safe and robust by allocating difficult decisions to a human expert. All existing work on L2D assumes that each expert is well-identified, and if any expert were to change, the system should be re-trained. In this work, we alleviate this constraint, formulating an L2D system that can cope with never-before-seen experts at test-time. We accomplish this by using meta-learning, considering both optimization- and model-based variants. Given a small context set to characterize the currently available expert, our framework can quickly adapt its deferral policy. For the model-based approach, we employ an attention mechanism that is able to look for points in the context set that are similar to a given test point, leading to an even more precise assessment of the expert's abilities. In the experiments, we validate our methods on image recognition, traffic sign detection, and skin lesion diagnosis benchmarks.
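The attention idea in the abstract can be illustrated with a minimal sketch. It is not the authors' architecture: it assumes the test point and context points are already embedded as plain vectors, uses a single scaled dot-product attention step with no learned parameters, and compares the resulting estimate of expert accuracy with the classifier's confidence, in the spirit of the usual L2D rule of deferring when the expert is more likely to be correct. The names `attend_expert_accuracy` and `should_defer` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_expert_accuracy(query, context_keys, expert_correct):
    """Estimate the expert's accuracy near `query` (hypothetical helper).

    query:          embedding of the test point (list of floats)
    context_keys:   embeddings of the context-set inputs
    expert_correct: 1.0 where the expert's prediction matched the label, else 0.0

    Context points similar to the query get larger attention weights,
    so the estimate reflects the expert's behaviour on similar inputs.
    """
    d = len(query)
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
        for key in context_keys
    ]
    weights = softmax(scores)
    return sum(w * c for w, c in zip(weights, expert_correct))


def should_defer(classifier_probs, est_expert_acc):
    """Defer when the estimated expert accuracy exceeds the classifier's confidence."""
    return est_expert_acc > max(classifier_probs)


# Toy context set: the expert is right on inputs like [4, 0], wrong on inputs like [0, 4].
context_keys = [[4.0, 0.0], [0.0, 4.0]]
expert_correct = [1.0, 0.0]

acc_near_strength = attend_expert_accuracy([4.0, 0.0], context_keys, expert_correct)
acc_near_weakness = attend_expert_accuracy([0.0, 4.0], context_keys, expert_correct)
```

In the setting the abstract describes, the embeddings and attention weights would be learned by meta-training across many experts. The fixed dot-product similarity above only stands in for that learned component.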
