Active Learning For Contextual Linear Optimization: A Margin-Based Approach
Mo Liu, Paul Grigas, Heyuan Liu, Zuo-Jun Max Shen
TL;DR
This work introduces MBAL-SPO, the first active-learning framework tailored for contextual linear optimization, where the objective coefficients are unknown and must be predicted from features. Guided by the SPO loss and its SPO+ surrogate, the method uses a margin-based criterion built on the distance to degeneracy to decide when to acquire a label, and it achieves lower label complexity than supervised learning that labels every sample. The authors provide non-asymptotic excess-risk bounds for both the SPO loss and surrogate losses, with separate analyses for hard and soft rejection variants and for separable SPO+ settings, under natural margin conditions. Experiments on shortest path and personalized pricing problems show that MBAL-SPO reaches substantially lower SPO risk with fewer labeled samples. The work connects margin-based active learning with decision-focused learning and gives label-complexity guarantees for deciding which samples to label.
Abstract
We develop the first active learning method for contextual linear optimization. Specifically, we introduce a label acquisition algorithm that sequentially decides whether to request the "labels" of feature samples from an unlabeled data stream, where the labels correspond to the coefficients of the objective in the linear optimization. Our method is the first to be directly informed by the decision loss induced by the predicted coefficients, referred to as the Smart Predict-then-Optimize (SPO) loss. Motivated by the structure of the SPO loss, our algorithm adopts a margin-based criterion utilizing the concept of distance to degeneracy. In particular, we design an efficient active learning algorithm with theoretical excess risk (i.e., generalization) guarantees. We derive upper bounds on the label complexity, defined as the number of samples whose labels are acquired to achieve a desired small level of SPO risk. These bounds show that our algorithm has a much smaller label complexity than the naive supervised learning approach that labels all samples, particularly when the SPO loss is minimized directly on the collected data. To address the discontinuity and nonconvexity of the SPO loss, we derive label complexity bounds under tractable surrogate loss functions. Under natural margin conditions, these bounds also outperform naive supervised learning. Using the SPO+ loss, a specialized surrogate of the SPO loss, we establish even tighter bounds under separability conditions. Finally, we present numerical evidence showing the practical value of our algorithms in settings such as personalized pricing and the shortest path problem.
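To make the two central quantities concrete, the following is a minimal sketch, not the authors' implementation. It assumes the feasible region is a polytope given by an explicit, finite list of distinct vertices, a minimization objective, and the Euclidean norm. The function names and the fixed `threshold` parameter are hypothetical, chosen only for illustration. The SPO loss is the excess true cost of the decision induced by the predicted coefficients. Under these assumptions, the distance to degeneracy is the smallest Euclidean distance from the predicted coefficients to a cost vector at which the optimal vertex ties with another vertex. The rule at the end mimics a hard-rejection criterion: a label is requested only when the prediction lies close to degeneracy.

```python
import numpy as np

def optimal_vertex(cost, vertices):
    """Index of the row of `vertices` that minimizes cost @ w."""
    return int(np.argmin(vertices @ cost))

def spo_loss(pred_cost, true_cost, vertices):
    """Excess true cost of acting on pred_cost instead of true_cost."""
    w_pred = vertices[optimal_vertex(pred_cost, vertices)]
    w_true = vertices[optimal_vertex(true_cost, vertices)]
    return float(true_cost @ w_pred - true_cost @ w_true)

def distance_to_degeneracy(pred_cost, vertices):
    """Euclidean distance from pred_cost to the nearest cost vector with
    more than one optimal vertex (assumes the vertices are distinct)."""
    values = vertices @ pred_cost
    best = int(np.argmin(values))
    # Objective gap between each other vertex and the optimal one.
    gaps = np.delete(values - values[best], best)
    # Distance from pred_cost to each tie hyperplane is gap / ||v_j - v_best||.
    diffs = np.delete(vertices, best, axis=0) - vertices[best]
    norms = np.linalg.norm(diffs, axis=1)
    return float(np.min(gaps / norms))

def should_request_label(pred_cost, vertices, threshold):
    """Hypothetical hard-rejection rule: ask for a label only when the
    prediction is within `threshold` of a degenerate cost vector."""
    return distance_to_degeneracy(pred_cost, vertices) < threshold

if __name__ == "__main__":
    # Toy problem: choose exactly one of two items (vertices of the simplex).
    vertices = np.eye(2)
    pred_cost = np.array([1.0, 1.2])
    true_cost = np.array([1.0, 0.5])
    print(spo_loss(pred_cost, true_cost, vertices))
    print(distance_to_degeneracy(pred_cost, vertices))
    print(should_request_label(pred_cost, vertices, threshold=0.5))
```

In this toy example the prediction picks the first item while the true costs favor the second, so the SPO loss is 0.5. The predicted costs are nearly tied, so the distance to degeneracy is small and the rule requests a label. In the setting the abstract describes, the prediction would come from a model updated over the data stream; the constant threshold here is a simplification.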
