Pay-Per-Crawl Pricing for AI: The LM-Tree Agent

Richard Archer, Soheil Ghili, Nima Haghpanah

Abstract

As AI systems shift from directing users to content toward consuming it directly, publishers need a new revenue model: charging AI crawlers for content access. This model, called pay-per-crawl, must solve a problem of mechanism selection at scale: content is too heterogeneous for a fixed pricing framework. Different sub-types warrant not only different price levels but different pricing rules based on different unstructured features, and there are too many to enumerate or design by hand. We propose the LM Tree, an adaptive pricing agent that grows a segmentation tree over the content library, using LLMs to discover what distinguishes high-value from low-value items and apply those attributes at scale, from binary purchase feedback alone. We evaluate the LM Tree on real content from a major German technology publisher, using 8,939 articles and 80,451 buyer queries with willingness-to-pay calibrated from actual AI crawler traffic. The LM Tree achieves a 65% revenue gain over a single static price and a 47% gain over two-category pricing, outperforming even the publisher's own 8-segment editorial taxonomy by 40%, recovering content distinctions the publisher's own categories miss.

Paper Structure

This paper contains 35 sections, 9 equations, 1 figure, 10 tables.

Figures (1)

  • Figure 1: One cycle of the LM Tree at a single node. The LLM Analyst reads content from sets $H$ and $L$ to discover a pricing-relevant split rule; the LLM Annotator applies that rule to all items in the node. At inference time, routing requires only the pre-computed annotations, with no LLM calls.
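
The inference-time claim in the Figure 1 caption (routing from stored annotations, without calling an LLM) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes binary split rules and boolean annotations, and every name, rule, and price below (`Node`, `route`, `contains_benchmark_data`, the price values) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Node:
    """One node of a hypothetical segmentation tree (binary splits assumed)."""
    price: float                    # price posted for items that stop at this node
    rule_id: Optional[str] = None   # identifier of the split rule; None marks a leaf
    yes: Optional["Node"] = None    # child for items annotated True under the rule
    no: Optional["Node"] = None     # child for items annotated False under the rule


def route(root: Node, annotations: Dict[str, bool]) -> Node:
    """Walk from the root to a leaf using stored annotations only (no LLM calls)."""
    node = root
    while node.rule_id is not None:
        node = node.yes if annotations[node.rule_id] else node.no
    return node


# Toy tree: the rule name and the prices are made up for illustration.
tree = Node(
    price=0.10,
    rule_id="contains_benchmark_data",
    yes=Node(price=0.25),
    no=Node(price=0.05),
)

# In this sketch, annotations are assumed to have been written once,
# at the time the split was created, and are simply looked up here.
article_annotations = {"contains_benchmark_data": True}
leaf = route(tree, article_annotations)
print(leaf.price)
```

Under these assumptions, pricing an item is a sequence of dictionary lookups, so its cost grows with the depth of the tree rather than with the length of the content.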