Language Models Can Reduce Asymmetry in Information Markets

Nasim Rahaman; Martin Weiss; Manuel Wüthrich; Yoshua Bengio; Li Erran Li; Chris Pal; Bernhard Schölkopf

TL;DR

This work introduces an open-source simulated digital marketplace where intelligent agents, powered by language models, buy and sell information on behalf of external participants, and addresses the buyer's inspection paradox for information markets.

Abstract

This work addresses the buyer's inspection paradox for information markets. The paradox is that buyers need to access information to determine its value, while sellers need to limit access to prevent theft. To study this, we introduce an open-source simulated digital marketplace where intelligent agents, powered by language models, buy and sell information on behalf of external participants. The central mechanism enabling this marketplace is the agents' dual capabilities: they not only have the capacity to assess the quality of privileged information but also come equipped with the ability to forget. This ability to induce amnesia allows vendors to grant temporary access to proprietary information, significantly reducing the risk of unauthorized retention while enabling agents to accurately gauge the information's relevance to specific queries or tasks. To perform well, agents must make rational decisions, strategically explore the marketplace through generated sub-queries, and synthesize answers from purchased information. Concretely, our experiments (a) uncover biases in language models leading to irrational behavior and evaluate techniques to mitigate these biases, (b) investigate how price affects demand in the context of informational goods, and (c) show that inspection and higher budgets both lead to higher quality outcomes.
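The inspect-then-forget mechanism described above can be illustrated with a minimal sketch. It is not the paper's implementation: the `Quote` and `BuyerAgent` classes, the `consider` method, and the keyword-overlap relevance check (a stand-in for a language-model judgment) are all hypothetical names and simplifications introduced here for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Quote:
    """A vendor's priced offer (hypothetical structure)."""
    passage: str
    price: int


@dataclass
class BuyerAgent:
    """Toy buyer that retains only what it has paid for (hypothetical)."""
    budget: int
    memory: list = field(default_factory=list)  # purchased passages only

    def is_relevant(self, question: str, passage: str) -> bool:
        # Assumption: naive keyword overlap stands in for an LLM relevance call.
        q_words = set(question.lower().split())
        return len(q_words & set(passage.lower().split())) >= 2

    def consider(self, question: str, quote: Quote) -> bool:
        """Inspect a quote; buy it if relevant and affordable, else forget it."""
        inspected = quote.passage  # temporary access granted by the vendor
        buy = quote.price <= self.budget and self.is_relevant(question, inspected)
        if buy:
            self.budget -= quote.price
            self.memory.append(inspected)
        # "Forgetting": nothing about an unpurchased passage is kept in memory.
        del inspected
        return buy
```

In this sketch, a rejected quote leaves no trace in `memory`, so a final answer synthesized from `memory` can draw only on purchased information.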

Paper Structure (15 sections, 1 theorem, 9 equations, 12 figures, 1 table)

Introduction
Related Work
The Information Bazaar
Principals and Agents
Tenders and Quotes
Following the Information Trail
Implementation Details
Experiments
Microeconomic Behavior of LLMs
Dynamics of the Information Bazaar
Higher Budget Improves Answer Quality
Formal Result on the Impact of Inspection on Expected Utility
Computation of Elo Ratings
Dataset Statistics
Prompts

Key Result

Theorem 1

Given the Monotonicity in Information Assumption, if for more $i$, $z_i$ changes from $0$ to $1$, then:

Figures (12)

Figure 1: The Information Bazaar is a simulated marketplace for information. Principals authorize buyer agents to answer a query (a question and budget). The process starts with buyer agents posting tenders (requests for specific information) on a Bulletin Board. Vendor agents, holding information from various external sources, assess these tenders and may respond with quotes (i.e., their priced information offers). Buyer agents then evaluate these quotes. If they decide not to purchase specific information, then they immediately forget it, ensuring only purchased information is retained for further use. The cycle of posting tenders, receiving, and assessing quotes continues, with buyer agents optionally forming sub-queries based on purchased information to seek deep insights. The agents work within this framework until they compile satisfactory answers, exhaust their budget, or reach a pre-set tree-size limit in the Bazaar. The final step involves the buyer agents synthesizing a comprehensive answer for their principals, using only the information they have purchased. tl;dr In the Information Bazaar, agents continuously navigate information exchange, only retaining and utilizing purchased information to derive comprehensive and satisfactory answers within given constraints.
Figure 2: Positional Bias. We present permutations of three options to LLMs and track the acceptance rates by position. Results are normalized and mean-adjusted. tl;dr: All models exhibit order bias, with GPT-3.5 and Llama 2 70B showing more, and GPT-4 showing less.
Figure 3: Rational Choice with Fungible Information. (A) Same Price. Models face a choice between two identical and equally priced information goods. The rational choice is to buy one or neither, as both goods contain the same information. GPT-4 and Llama 2 (70B) consistently choose rationally. GPT-3.5, however, only acts rationally after an internal debate. Please refer to Figure \ref{fig:rational-choice-same-price-bar} for disaggregated results. tl;dr: GPT-4 and Llama 2 (70B) make rational choices with identical goods; GPT-3.5 needs an internal debate to do the same. (B) Different Price. With one option now priced higher, both GPT-3.5 and Llama 2 (70B) show more errors, hinting at a preference for price over quality in selection. Despite this, internal debate proves to be a more reliable selection method. Please refer to Figure \ref{fig:rational-choice-diff-price-bar} for disaggregated results. tl;dr: Higher pricing confuses GPT-3.5 and Llama 2 (70B); internal debate mitigates this issue largely for GPT-3.5 and somewhat for Llama 2 (70B).
Figure 4: How Price Affects Demand for the Gold Passage. We vary the price of the gold passage amid competing alternatives. Models are presented three options: two relevant passages and the gold passage, all initially priced at 10 credits. As the gold passage price rises, GPT-3.5 and GPT-4 increasingly opt for alternatives, exhibiting strong positive cross elasticity. Llama 2 (70B) shows a mild preference for mid-priced goods. tl;dr Higher gold passage prices push GPT-3.5 and GPT-4 toward alternatives, while Llama 2 (70B) leans towards mid-priced options.
Figure 5: Enhanced Answer Quality with Increased Budget (Left). This figure evaluates the answer quality of a Llama 2 (70B) agent across a range of budgets, with inspection permitted. It shows the estimated Elo scores of answers at each budget (higher scores signify better answers; see Appendix \ref{sec:elo-rating} for details). tl;dr: Allocating more market credits to the agent improves the relative answer quality. Inspection Improves Answer Quality (Right). This panel assesses the answer quality of a Llama 2 (70B) agent in the Information Bazaar, using a GPT-4 simulated debate among domain experts for evaluation (refer to the appendix for prompt details). In the “With Inspection” scenario, the Llama 2 agent is permitted to scrutinize a passage prior to purchase. In contrast, the “Without Inspection” scenario limits the agent to viewing only the passage’s metadata, specifically the paper and section titles. tl;dr: Allowing inspection delivers better value for the money spent, especially for larger budgets.
...and 7 more figures
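Figure 5 reports answer quality as Elo scores. As a rough illustration of how pairwise comparisons can be turned into ratings, here is a sketch of the standard Elo update. It is not necessarily the procedure the paper's appendix uses; the function names, the K-factor of 32, and the 400-point scale are conventional defaults assumed here.

```python
def expected_score(r_a: float, r_b: float) -> float:
    """Standard Elo expected score of A against B (logistic, base 10, scale 400)."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def update_elo(r_a: float, r_b: float, score_a: float, k: float = 32.0):
    """Return updated ratings after one comparison.

    score_a is 1.0 if answer A was judged better, 0.0 if B was, 0.5 for a tie.
    The K-factor of 32 is an assumed conventional value.
    """
    e_a = expected_score(r_a, r_b)
    new_a = r_a + k * (score_a - e_a)
    new_b = r_b + k * ((1.0 - score_a) - (1.0 - e_a))
    return new_a, new_b
```

Under these assumptions, two equally rated answers each have an expected score of 0.5, and the update moves the same number of points from the loser to the winner, so the sum of the two ratings is unchanged.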

Theorems & Definitions (1)

Theorem 1: Impact of Inspection on Expected Utility
