Efficient Budget Allocation for Large-Scale LLM-Enabled Virtual Screening

Zaile Li, Weiwei Fan, L. Jeff Hong

Abstract

Screening tasks that aim to identify a small subset of top alternatives from a large pool are common in business decision-making processes. These tasks often require substantial human effort to evaluate each alternative's performance, making them time-consuming and costly. Motivated by recent advances in large language models (LLMs), particularly their ability to generate outputs that align well with human evaluations, we consider an LLM-as-human-evaluator approach for conducting screening virtually, thereby reducing the cost burden. To achieve scalability and cost-effectiveness in virtual screening, we identify that the stochastic nature of LLM outputs and their cost structure necessitate efficient budget allocation across all alternatives. To address this, we propose using a top-$m$ greedy evaluation mechanism, a simple yet effective approach that keeps evaluating the current top-$m$ alternatives, and design the explore-first top-$m$ greedy (EFG-$m$) algorithm. We prove that EFG-$m$ is both sample-optimal and consistent in large-scale virtual screening. Surprisingly, we also uncover a bonus ranking effect, where the algorithm naturally induces an indifference-based ranking within the selected subset. To further enhance practicality, we design a suite of algorithm variants to improve screening performance and computational efficiency. Numerical experiments validate our results and demonstrate the effectiveness of our algorithms. Lastly, we conduct a case study on LLM-based virtual screening. The study shows that while LLMs alone may not provide meaningful screening and ranking results when directly queried, integrating them with our sample-optimal algorithms unlocks their potential for cost-effective, large-scale virtual screening.
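The explore-first, top-$m$ greedy idea described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's EFG-$m$ specification: the abstract does not give the algorithm's details, so the initial sample size `n0`, the rule of evaluating each current top-$m$ alternative once per round, the stable tie-breaking, and the names `efg_m_sketch` and `sample` are all assumptions made for this example. In the LLM setting, `sample(j, rng)` would stand in for one stochastic LLM evaluation of alternative `j`.

```python
import numpy as np


def efg_m_sketch(sample, k, m, n0, budget, rng):
    """Illustrative explore-first top-m greedy loop (details are assumptions).

    sample(j, rng) returns one noisy evaluation of alternative j.
    k is the number of alternatives, m the subset size, n0 the number of
    exploration samples per alternative, budget the total number of samples.
    """
    if not 1 <= m <= k:
        raise ValueError("m must satisfy 1 <= m <= k")
    if n0 < 1 or budget < k * n0:
        raise ValueError("budget must cover n0 >= 1 samples per alternative")

    sums = np.zeros(k)
    counts = np.zeros(k, dtype=int)

    # Exploration phase: every alternative receives n0 evaluations.
    for j in range(k):
        for _ in range(n0):
            sums[j] += sample(j, rng)
        counts[j] = n0
    used = k * n0

    # Greedy phase: keep evaluating the alternatives that currently have
    # the m largest sample means (one extra evaluation each per round).
    while used + m <= budget:
        means = sums / counts
        top = np.argsort(-means, kind="stable")[:m]
        for j in top:
            sums[j] += sample(j, rng)
            counts[j] += 1
        used += m

    # Report the final top-m by sample mean, plus per-alternative counts.
    means = sums / counts
    selected = np.argsort(-means, kind="stable")[:m]
    return selected, counts


if __name__ == "__main__":
    # Hypothetical test bed: Gaussian noise around made-up true means.
    true_means = np.linspace(0.0, 1.0, 50)
    rng = np.random.default_rng(0)
    noisy = lambda j, g: true_means[j] + g.normal(scale=1.0)
    chosen, counts = efg_m_sketch(noisy, k=50, m=5, n0=10, budget=5000, rng=rng)
    print(sorted(chosen.tolist()), counts.sum())
```

In this sketch the sample counts returned alongside the selection show how the greedy phase concentrates the remaining budget on alternatives that stay in the running top-$m$.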

Paper Structure

This paper contains 69 sections, 10 theorems, 86 equations, 14 figures, 9 tables, 3 algorithms.

Key Result

Lemma 1

For any alternative $j\in\mathcal{G}$, its running average process $\{\bar{X}_j(n): n=n_0,n_0+1,\dots\}$ reaches its minimum within a finite number of observations almost surely, i.e., $\mathop{\rm arg\,min}_{n \in [n_0, \infty)}\bar{X}_j(n)<\infty$ almost surely.
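The quantity in Lemma 1 can be made concrete with a short sketch that computes the running average $\bar{X}_j(n)$ from a finite list of observations and reports where it attains its minimum over $n \ge n_0$. This only illustrates the definition on a truncated horizon and does not verify the almost-sure statement; the function name `running_average_argmin` is hypothetical.

```python
import numpy as np


def running_average_argmin(observations, n0):
    """Return (n_star, value): the first n >= n0 at which the running
    average of the given finite observation sequence is smallest."""
    x = np.asarray(observations, dtype=float)
    if not 1 <= n0 <= x.size:
        raise ValueError("n0 must lie between 1 and the number of observations")
    n = np.arange(1, x.size + 1)
    running = np.cumsum(x) / n      # running[i] is the average of the first i+1 observations
    tail = running[n0 - 1:]         # restrict to n >= n0
    idx = int(np.argmin(tail))      # first index of the minimum within the tail
    return n0 + idx, float(tail[idx])
```

For example, the observations `[3, 1, 2, 6]` have running averages `3, 2, 2, 3`, so with `n0 = 1` the minimum is first attained at `n = 2`.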

Figures (14)

  • Figure 1: Evaluation process using LLMs to estimate willingness to pay for a laptop design
  • Figure 2: Top-2 greedy selection process of an example problem with $4$ alternatives where $\bar{X}_1(n) = \mu_1$ and $\bar{X}_2(n) = \mu_2, \forall n \geq 1$. Numbers in the markers represent the total sample sizes.
  • Figure 3: A comparison between top-2 and top-3 greedy selection processes of an example problem with $4$ alternatives where $\bar{X}_1(n) = \mu_1$ and $\bar{X}_2(n) = \mu_2, \forall n \geq 1$. Numbers in the markers represent the total sample sizes.
  • Figure EC.1: A comparison between top-2 and top-3 greedy selection processes of an example problem with $4$ alternatives. Numbers in the markers represent the total sample sizes.
  • Figure EC.2: A comparison between the PGS$_m$ of the EFG-$M$, SAR, and SAR-$M$ algorithms
  • ...and 9 more figures

Theorems & Definitions (22)

  • Definition 1
  • Definition 2
  • Remark 1
  • Remark 2
  • Remark 3
  • Lemma 1
  • Lemma 2
  • Theorem 1
  • Lemma 3
  • Theorem 2
  • ...and 12 more