Adaptive Contracts for Cost-Effective AI Delegation

Eden Saig, Tamar Garbuz, Ariel D. Procaccia, Inbal Talgam-Cohen, Jamie Tucker-Foltz

Abstract

When organizations delegate text generation tasks to AI providers via pay-for-performance contracts, expected payments rise when evaluation is noisy. As evaluation methods become more elaborate, the economic benefits of decreased noise are often overshadowed by increased evaluation costs. In this work, we introduce adaptive contracts for AI delegation, which allow detailed evaluation to be performed selectively after observing an initial coarse signal in order to conserve resources. We make three sets of contributions: First, we provide efficient algorithms for computing optimal adaptive contracts under natural assumptions or when core problem dimensions are small, and prove hardness of approximation in the general unstructured case. We then formulate alternative models of randomized adaptive contracts and discuss their benefits and limitations. Finally, we empirically demonstrate the benefits of adaptivity over non-adaptive baselines using question-answering and code-generation datasets.
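The selective-inspection idea can be illustrated with a minimal sketch of a toy two-stage setting. This is not the paper's formal model: the action names, signal labels, probabilities, payments, and the inspection cost below are all hypothetical values chosen for illustration, and the helper functions are assumptions rather than anything defined by the authors. The sketch only evaluates a given contract (the agent's best response and the principal's expected cost); it does not compute an optimal one.

```python
# Toy model (all names and numbers are illustrative assumptions).
# Each action maps to (agent's cost, joint distribution over
# (coarse signal, refined outcome)). The refined outcome is only
# observed by the principal when the coarse signal is inspected.

def expected_payment(joint, inspect, pay_coarse, pay_fine):
    """Expected payment to the agent under one action's distribution."""
    total = 0.0
    for (signal, outcome), prob in joint.items():
        if signal in inspect:
            # Inspected: payment may depend on the refined outcome.
            total += prob * pay_fine.get((signal, outcome), 0.0)
        else:
            # Not inspected: payment depends on the coarse signal only.
            total += prob * pay_coarse.get(signal, 0.0)
    return total


def inspection_probability(joint, inspect):
    """Probability that the coarse signal falls in the inspected set."""
    return sum(p for (signal, _), p in joint.items() if signal in inspect)


def evaluate_contract(actions, inspect, pay_coarse, pay_fine, inspection_cost):
    """Return (agent's best action, principal's expected payment + inspection cost)."""
    def agent_utility(name):
        cost, joint = actions[name]
        return expected_payment(joint, inspect, pay_coarse, pay_fine) - cost

    # Ties are broken by dictionary order (a simplification).
    best = max(actions, key=agent_utility)
    _, joint = actions[best]
    total = (expected_payment(joint, inspect, pay_coarse, pay_fine)
             + inspection_cost * inspection_probability(joint, inspect))
    return best, total


actions = {
    "shirk": (0.0, {("lo", "bad"): 0.6, ("lo", "good"): 0.1,
                    ("hi", "bad"): 0.2, ("hi", "good"): 0.1}),
    "work":  (1.0, {("lo", "bad"): 0.1, ("lo", "good"): 0.1,
                    ("hi", "bad"): 0.1, ("hi", "good"): 0.7}),
}

# Adaptive: inspect only after a "hi" coarse signal, pay 2 if the outcome is good.
adaptive = evaluate_contract(actions, {"hi"}, {}, {("hi", "good"): 2.0}, 0.5)

# Non-adaptive baseline: always inspect, pay 2 whenever the outcome is good.
always = evaluate_contract(actions, {"lo", "hi"}, {},
                           {("lo", "good"): 2.0, ("hi", "good"): 2.0}, 0.5)
```

With these made-up numbers, both contracts induce the costly action, but the adaptive contract costs the principal about 1.8 in expectation versus about 2.1 for always inspecting, because it skips inspection after the less informative coarse signal.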

Paper Structure (55 sections, 19 theorems, 56 equations, 8 figures, 1 table)

Key Result

Theorem 3.2

Consider adaptive contract settings with constant-many actions. Then an optimal adaptive contract can be computed in polynomial time.

Figures (8)

  • Figure 1: Principal-agent interaction in an abstract adaptive contract setting. Colored text describes the application to AI task delegation.
  • Figure 2: Relationships between expected payments in optimal deterministic versus nondeterministic contracts. Note that the strict inequality holds only when the optimal contract inspects.
  • Figure 3: Our four nondeterministic problem variants, partially ordered by optimal principal's utility.
  • Figure 4: Empirical evaluation of adaptive contracts on AlpacaEval data (\ref{sec:alpacaeval}). (Top) Outcome distributions. The initial signal is length thresholding, and inspection provides a human-annotated pairwise comparison to a reference output generated by GPT-4. (Bottom Left) Optimal adaptive contract. The contract inspects short outputs and rewards concise answers that are better than the GPT-4 reference. (Bottom Right) Comparison of principal utility across different types of contracts. Adaptive contracts yield significantly higher utility to the principal in this setting.
  • Figure 5: Adaptive contracts on SWE-Bench data (\ref{sec:swebench}). (Left) Optimal inspection policy as a function of the inspection cost $d$, showing the transition between three optimal policies as $d$ grows. (Right) Information design optimization heat map. Cells represent the cost of the optimal contract given initial and refined test counts.
  • ...and 3 more figures

Theorems & Definitions (41)

  • Example 1.1: Costly secondary evaluation
  • Example 1.2: Adaptive randomized evaluation
  • Example 3.1
  • Theorem 3.2
  • Definition 3.3: ISOP
  • Definition 3.4: Symmetric-ISOP
  • Theorem 3.5
  • Theorem 4.1
  • Proof: Proof sketch
  • Theorem 4.2
  • ...and 31 more