Truthful Aggregation of LLMs with an Application to Online Advertising

Ermis Soumalias, Michael J. Curry, Sven Seuken

TL;DR

MOSAIC is an auction mechanism that makes truthful reporting a dominant strategy for advertisers, aligns each advertiser's utility with their contribution to social welfare, and can incorporate contextual information about advertisers, which significantly improves social welfare.

Abstract

The next frontier of online advertising is revenue generation from LLM-generated content. We consider a setting where advertisers aim to influence the responses of an LLM to align with their interests, while platforms seek to maximize advertiser value and ensure user satisfaction. The challenge is that advertisers' preferences generally conflict with those of the user, and advertisers may misreport their preferences. To address this, we introduce MOSAIC, an auction mechanism that ensures that truthful reporting is a dominant strategy for advertisers and that aligns the utility of each advertiser with their contribution to social welfare. Importantly, the mechanism operates without LLM fine-tuning or access to model weights and provably converges to the output of the optimally fine-tuned LLM as computational resources increase. Additionally, it can incorporate contextual information about advertisers, which significantly improves social welfare. Through experiments with a publicly available LLM, we show that MOSAIC leads to high advertiser value and platform revenue with low computational overhead. While our motivating application is online advertising, our mechanism can be applied in any setting with monetary transfers, making it a general-purpose solution for truthfully aggregating the preferences of self-interested agents over LLM-generated replies.
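
As a rough illustration of how a reply could be chosen without fine-tuning, the sketch below resamples one reply from $M$ candidates by self-normalized importance sampling. It is a minimal sketch under stated assumptions, not the paper's algorithm: the abstract does not give the sampling rule, so the weights used here (reference-to-generator probability ratio times the exponentiated sum of reported rewards), the temperature `tau`, and the function name `sample_reply` are all hypothetical. Payments are not modeled.

```python
import math
import random


def sample_reply(candidates, log_p_ref, log_p_gen, rewards, tau=1.0, rng=None):
    """Pick one of M candidate replies by self-normalized importance sampling.

    ASSUMPTION: the target policy is proportional to
    pi_ref(y|x) * exp(sum_i r_i(y) / tau); this form is not stated in the
    abstract and is used here only for illustration.

    candidates   : list of M replies drawn from the generating LLM
    log_p_ref[j] : log-probability of candidate j under the reference LLM
    log_p_gen[j] : log-probability of candidate j under the generating LLM
    rewards[i][j]: advertiser i's reported reward for candidate j
    tau          : hypothetical temperature trading off rewards vs. reference
    """
    rng = rng or random.Random()
    m = len(candidates)

    # Total reported advertiser reward for each candidate.
    total = [sum(r[j] for r in rewards) for j in range(m)]

    # Log importance weight: correct for sampling from the generating LLM
    # instead of the reference LLM, then tilt by the aggregate reward.
    log_w = [log_p_ref[j] - log_p_gen[j] + total[j] / tau for j in range(m)]

    # Normalize in a numerically stable way (subtract the max before exp).
    mx = max(log_w)
    w = [math.exp(v - mx) for v in log_w]
    z = sum(w)
    probs = [v / z for v in w]

    idx = rng.choices(range(m), weights=probs, k=1)[0]
    return candidates[idx], probs


# Usage: two candidates, one advertiser who prefers the second reply.
reply, probs = sample_reply(
    candidates=["reply A", "reply B"],
    log_p_ref=[-1.0, -1.0],
    log_p_gen=[-1.0, -1.0],
    rewards=[[0.0, math.log(3.0)]],
    tau=1.0,
    rng=random.Random(0),
)
```

With more candidates, a resampling scheme of this kind approximates its target distribution more closely, which is in the spirit of the convergence claim in the abstract; the exact guarantee and the payment rule are given in the paper itself.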

Paper Structure (50 sections, 11 theorems, 40 equations, 16 figures, 2 tables, 1 algorithm)

Key Result

Corollary 4.1

For any reported reward functions $r \in R$ by the advertisers and any LLM $\pi_{\text{gen}}$ such that $\pi_{\text{ref}}$ is absolutely continuous with respect to $\pi_{\text{gen}}$, the MOSAIC policy ${\pi_{r, M}(\cdot | x)}$ induced by Algorithm 1 using $M$ candidate replies con

Figures (16)

  • Figure 1: Reply log probability and total advertiser normalized reward as a function of the number of candidate sequences generated using $\pi_{\text{ref}}$ and $\pi_{\text{con}}$. Averaged over 1250 runs including 95% CIs.
  • Figure 2: Revenue as a function of the number of replies generated using $\pi_{\text{ref}}$ and $\pi_{\text{con}}$.
  • Figure 3: Reply log probability with respect to the reference LLM as a function of the number of replies generated using $\pi_{\text{ref}}$ and $\pi_{\text{con}}$.
  • Figure 4: Comparison of total advertiser utility gain from participation, with and without the payment offset, as a function of the number of candidate sequences generated using $\pi_{\text{ref}}$ and $\pi_{\text{con}}$. Averaged over 1250 runs including 95% CIs.
  • Figure 5: Comparative scatter plots of advertiser reward and utility gain from participation, with and without the payment offset (section "bidding zero offset"), for candidate sequences generated by the context-aware LLM $\pi_{\text{gem}}$. We additionally show a linear regressor fit to that data, its slope and its $R^2$.
  • ...and 11 more figures

Theorems & Definitions (23)

  • Definition 3.1: Strategyproof
  • Corollary 4.1
  • Lemma 4.1
  • Theorem 5.1
  • Remark 1
  • Lemma 5.2
  • Remark 2
  • Theorem A.1
  • Proof of thm:context_aware_mech_induced_policy
  • Proof of Corollary:mechanism_follows_optimal_distribution
  • ...and 13 more