Menu Pricing of Large Language Models

Dirk Bergemann, Alessandro Bonatti, Alex Smolin

TL;DR

This work develops a framework for the optimal pricing and product design of LLMs in which a provider sells menus of token budgets to users who differ in their valuations across a continuum of tasks, and shows that competitive pressure reshapes both the intensive and extensive margins of compute provision.

Abstract

We develop a framework for the optimal pricing and product design of LLMs in which a provider sells menus of token budgets to users who differ in their valuations across a continuum of tasks. Under a homogeneous production technology, we show that users' high-dimensional type profiles are summarized by a scalar index, reducing the seller's problem to one-dimensional screening. The optimal mechanism takes the form of committed-spend contracts: buyers pay for a budget that they allocate across token classes priced at marginal cost. We extend the analysis to environments with multiple differentiated models and to competition between a proprietary leader and an open-source fringe, showing that competitive pressure reshapes both the intensive and extensive margins of compute provision. Each element of our theory (token-budget menus, maximum- and minimum-spend plans, multi-model versioning, and linear API pricing) has a direct counterpart in the observed pricing practices of providers such as Anthropic, OpenAI, and GitHub.
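As a rough illustration of the committed-spend idea described above, the sketch below shows how a buyer might split a prepaid budget across token classes that are each priced at marginal cost. It is not the paper's model: the logarithmic utility, the function name `allocate_budget`, and all numeric values are assumptions chosen only because they give a closed-form allocation, in which the spending share on each class equals its normalized task weight.

```python
def allocate_budget(budget, weights, marginal_costs):
    """Split a committed budget across token classes.

    Assumption (hypothetical, not from the paper): the buyer maximizes
    sum_i w_i * log(x_i) subject to sum_i c_i * x_i <= budget, where x_i is
    the number of tokens in class i and c_i is that class's marginal cost.
    Under this assumed utility, the buyer spends the share w_i / sum(w) of
    the budget on class i, so x_i = budget * w_i / sum(w) / c_i.
    """
    if budget < 0:
        raise ValueError("budget must be non-negative")
    if len(weights) != len(marginal_costs):
        raise ValueError("weights and marginal_costs must have equal length")
    if any(w <= 0 for w in weights) or any(c <= 0 for c in marginal_costs):
        raise ValueError("weights and marginal costs must be positive")

    total_weight = sum(weights)
    return [budget * w / total_weight / c
            for w, c in zip(weights, marginal_costs)]


# Illustrative numbers only: two token classes with different marginal costs.
example_costs = [0.5, 2.0]
example_weights = [3.0, 1.0]
example_tokens = allocate_budget(100.0, example_weights, example_costs)

# Total spend at marginal-cost prices exhausts the committed budget.
example_spend = sum(c * x for c, x in zip(example_costs, example_tokens))
```

Under these assumed preferences the buyer's choice depends on the contract only through the budget, which is one simple way to see why a menu of budgets can screen buyers while leaving per-token prices at marginal cost.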

Paper Structure

This paper contains 46 sections, 16 theorems, 136 equations, 3 figures, and 4 tables.

Key Result

Proposition 1

Under the efficient allocation, all buyer types $w$ with the same aggregate type $\theta(w)$ consume the same number of fine-tuning tokens, consume the same total number of inference tokens in each class, and obtain the same total payoff. The number of inference tokens allocated to task $i$ is propo

Figures (3)

  • Figure 1: Leader-Fringe allocations across different regimes. Example with $\theta\sim U[0,1]$, $\sigma=1/2$, $\hat{\sigma}_L=1/4$, $c_F=1/10$, $c_L=1/8$.
  • Figure 2: Anthropic subscription tiers (January 2026). Model access is constant across paid tiers; differentiation occurs through usage allocations.
  • Figure 3: OpenAI ChatGPT subscription tiers (January 2026). Higher tiers grant access to more capable models and larger usage allocations.

Theorems & Definitions (26)

  • Proposition 1: Efficient Allocation
  • Corollary 1: Constrained Efficient Allocation
  • Proposition 2: Buyer Indirect Utility
  • Lemma 1: Cost Function
  • Proposition 3: Optimal Menu
  • Definition 1: Maximum-Spend Mechanism
  • Definition 2: Minimum-Spend Mechanism
  • Definition 3: Two-Part-Tariff Mechanism
  • Proposition 4: Indirect Implementation
  • Lemma 2: Buyer-Optimal Payoff
  • ...and 16 more