Sequential Selection with Expirations

Yihua Xu; Rohan Ghuge; Sebastian Perez-Salazar

TL;DR

Sequential Selection with Expirations (SSE) studies online decision-making where options may expire stochastically while evaluation times are uncertain, with complete independence across options and known distributions. The authors introduce a time-indexed LP relaxation that upper-bounds the best online policy and a polynomial-time rounding scheme yielding a $\frac{1}{2}\bigl(1-\frac{1}{e}\bigr)$-approximation to the LP value (and hence to the online optimum). In the special case of i.i.d. evaluation times, a simple greedy policy that always picks the highest-valued available option achieves a $1/2$-approximation, and this bound is tight within that class. The framework extends naturally to deadlines and knapsack constraints, preserving similar approximation guarantees. Empirically, the LP-based policies perform robustly on synthetic and real datasets (including active search with LLM performance data and call-center logs), validating practical applicability and demonstrating the balance between high-value options and shorter evaluations in dynamic, uncertain environments.

Abstract

Motivated by applications where impatience is pervasive and evaluation times are uncertain, we study a selection model where options may expire at an unknown point in time and evaluation times are stochastic. Initially, the decision-maker (DM) has access to $n$ options with known non-negative values; these options have unknown stochastic evaluation and expiration times with known distributional information, which we assume to be independent. When the DM is free, it can select an available option, which occupies the DM for an unknown amount of time, and collect its value. The objective is to maximize the expected total value obtained from options selected by the DM. Natural formulations of this problem suffer from the curse of dimensionality. In fact, this problem is NP-hard even in the deterministic case. Hence, we focus on efficiently computable approximation algorithms that can provide high expected reward compared to the optimal expected value. Towards this end, we first provide a compact linear programming (LP) relaxation that gives an upper bound on the expected value obtained by the optimal policy. Then we design a polynomial-time algorithm that is nearly a $(1/2)\cdot (1-1/e)$-approximation to the optimal LP value (so also to the optimal expected value). We next shift our focus to the case of independent and identically distributed (i.i.d.) evaluation times. In this case, we show that the greedy policy that always selects the highest-valued option whenever the DM is free obtains a $1/2$-approximation to the optimal expected value. Our approaches extend effortlessly, and we demonstrate their flexibility by providing approximations to natural extensions of our problem. Finally, we evaluate our LP-based policies and the greedy policy empirically on synthetic and real datasets.
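The greedy policy described in the abstract can be illustrated with a small simulation. This is a minimal sketch rather than the authors' implementation, and it rests on assumptions that the summary does not state: time is discrete and starts at period 1, an option's expiration is the last period in which it can still be selected, and the value is collected at the moment of selection. The names `Option`, `simulate_greedy`, and `estimate_greedy_value` are hypothetical.

```python
import random
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Option:
    value: float
    # Draws the (integer) time the DM is occupied once this option is selected.
    sample_eval_time: Callable[[random.Random], int]
    # Draws the last period in which the option is still available (assumption).
    sample_expiration: Callable[[random.Random], int]


def simulate_greedy(options: List[Option], rng: random.Random) -> float:
    """Run one sample path of the greedy policy and return the collected value."""
    # Expirations are independent of the DM's actions, so draw them up front.
    expirations = [o.sample_expiration(rng) for o in options]
    remaining = set(range(len(options)))
    t = 1  # first period in which the DM is free
    total = 0.0
    while True:
        available = [i for i in remaining if expirations[i] >= t]
        if not available:
            # Expired options never return, so nothing can be selected later.
            break
        # Greedy rule: pick the highest-valued option that is still available.
        best = max(available, key=lambda i: options[i].value)
        total += options[best].value
        remaining.remove(best)
        # The DM stays busy for the realized evaluation time.
        t += options[best].sample_eval_time(rng)
    return total


def estimate_greedy_value(options: List[Option], trials: int = 10_000, seed: int = 0) -> float:
    """Monte Carlo estimate of the greedy policy's expected total value."""
    rng = random.Random(seed)
    return sum(simulate_greedy(options, rng) for _ in range(trials)) / trials
```

For example, with values 10, 6, and 5, a deterministic evaluation time of 2 periods for every option, and expirations 1, 1, and 3, the greedy policy takes the value-10 option in period 1, is free again in period 3, and can then only take the value-5 option, for a total of 15. Replacing the constant lambdas with random draws (for instance `lambda r: r.randint(1, 3)`) gives a stochastic instance whose expected value `estimate_greedy_value` approximates.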

Paper Structure (39 sections, 19 theorems, 37 equations, 2 figures, 4 tables, 1 algorithm)

Introduction
Problem Definition
Results and Techniques
Selected Applications
Organization.
Related Work
LP-based Algorithms for SSE
The Linear Programming Relaxation
The Algorithm and its Analysis
Proof of \ref{thm:appx-bound}.
Proof of \ref{lem:key}
Limitations of Our Algorithm and Analysis
Constrained Sequential Selection with Expiration
Greedy Policies for I.I.D. Evaluation Times
Step 1: Designing the intermediate policy.
...and 24 more sections

Key Result

Theorem 1.1

For any $\epsilon > 0$, there exists a polynomial-time algorithm that achieves a $\left(0.316-\epsilon\right)$-approximation for the sequential selection with expiration problem, relative to the online optimum.
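The constant in Theorem 1.1 appears to be the numerical value of the factor $(1/2)\cdot(1-1/e)$ quoted in the abstract, with the $\epsilon$ presumably accounting for the abstract's "nearly". The arithmetic, as a quick check:

$$
\frac{1}{2}\left(1-\frac{1}{e}\right) \approx \frac{1}{2}\,(1 - 0.36788) = \frac{1}{2}\cdot 0.63212 \approx 0.31606 .
$$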

Figures (2)

Figure 1: Comparison of coupled values. In this example, $1=t_1< 2 < t_2$ and so $\mathtt{ALG}^2$ does not change the action in $t_2$.
Figure 2:

Theorems & Definitions (39)

Remark 1.1: Uncertain and Time Dependent Values
Theorem 1.1
Theorem 1.2
Example 1.1
Example 1.2
Example 1.3
Theorem 1.3
Theorem 1.4
Theorem 1.5
Theorem 3.1
...and 29 more
