Strategy-robust Online Learning in Contextual Pricing

Joon Suk Huh, Kirthevasan Kandasamy

TL;DR

This work studies online contextual pricing when buyers' valuations are unknown and may be strategically misreported. It introduces a strategy-robust notion of regret (a Price-of-Anarchy-style metric) for multi-buyer online pricing, and gives a PTAS for truthful buyers via a novel online sketching approach that reduces the action space to a compact set of sketches. To achieve robustness to all Nash equilibria among strategic buyers, the Sparse Update Mechanism (SUM) randomly limits learning updates and occasionally prices randomly, which yields a bound on strategy-robust regret and a black-box reduction from any no-regret online expert algorithm. A key theoretical result shows that attaining sublinear regret in OMR is computationally hard, which motivates the PTAS and the SUM construction. Overall, the framework supports strategy-robust, context-aware online pricing with provable performance guarantees, and applies broadly to online mechanism design under strategic behavior.

Abstract

Learning effective pricing strategies is crucial in digital marketplaces, especially when buyers' valuations are unknown and must be inferred through interaction. We study the online contextual pricing problem, where a seller observes a stream of context-valuation pairs and dynamically sets prices. Moreover, departing from traditional online learning frameworks, we consider a strategic setting in which buyers may misreport valuations to influence future prices, a challenge known as strategic overfitting (Amin et al., 2013). We introduce a strategy-robust notion of regret for multi-buyer online environments, capturing worst-case strategic behavior in the spirit of the Price of Anarchy. Our first contribution is a polynomial-time approximation scheme (PTAS) for learning linear pricing policies in adversarial, adaptive environments, enabled by a novel online sketching technique. Building on this result, we propose our main construction: the Sparse Update Mechanism (SUM), a simple yet effective sequential mechanism that ensures robustness to all Nash equilibria among buyers. Moreover, our construction yields a black-box reduction from online expert algorithms to strategy-robust learners.
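
To make the idea of sparse updates concrete, the sketch below wraps a standard Hedge (multiplicative-weights) learner so that only a random subset of rounds feeds back into learning, and a random price is occasionally posted. It is a toy illustration, not the paper's SUM: it assumes a single buyer, no contexts, a fixed finite price grid, and reported values in $[0, 1]$. The names `Hedge`, `sparse_update_pricing`, `update_prob`, and `explore_prob` are hypothetical and do not come from the paper.

```python
import math
import random


class Hedge:
    """Standard Hedge learner over a finite set of experts (here: prices)."""

    def __init__(self, n_experts, lr):
        self.lr = lr
        self.log_w = [0.0] * n_experts  # log-weights, for numerical stability

    def probs(self):
        m = max(self.log_w)
        w = [math.exp(x - m) for x in self.log_w]
        total = sum(w)
        return [x / total for x in w]

    def sample(self, rng):
        return rng.choices(range(len(self.log_w)), weights=self.probs())[0]

    def update(self, rewards):
        # rewards[i] is the reward expert i would have earned, in [0, 1]
        for i, r in enumerate(rewards):
            self.log_w[i] += self.lr * r


def revenue(price, value):
    """Posted-price revenue: the sale happens only if value >= price."""
    return price if value >= price else 0.0


def sparse_update_pricing(values, prices, update_prob, explore_prob,
                          lr=0.5, seed=0):
    """Toy sparse-update wrapper (all parameters are illustrative assumptions)."""
    rng = random.Random(seed)
    learner = Hedge(len(prices), lr)
    total, n_updates = 0.0, 0
    for v in values:
        if rng.random() < explore_prob:
            price = rng.choice(prices)           # occasional random price
        else:
            price = prices[learner.sample(rng)]  # price chosen by the learner
        total += revenue(price, v)
        if rng.random() < update_prob:
            # Only a random subset of rounds influences future prices.
            learner.update([revenue(p, v) for p in prices])
            n_updates += 1
    return total, n_updates, learner.probs()


if __name__ == "__main__":
    reported = [0.5] * 200
    grid = [0.25, 0.5, 0.75, 1.0]
    print(sparse_update_pricing(reported, grid, update_prob=0.1, explore_prob=0.05))
```

The intuition behind the design, under these assumptions: when a round is rarely used for learning, a single misreport has little expected influence on future prices, so a buyer gains less from manipulating it. How the paper actually sets the update and randomization rates, and how it handles contexts and multiple buyers, is not specified in this summary.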

Paper Structure

This paper contains 43 sections, 20 theorems, 88 equations.

Key Result

Theorem 1

[Adapted from liu2018learning] Suppose the Exponential Time Hypothesis [impagliazzo2001complexity] holds. Then, no algorithm $\mathcal{A}$ for the seller in OMR can attain sublinear regret, i.e., $\sup_{\mathcal{E}, \mathcal{T}} \mathsf{Reg}(\mathcal{A}; \mathcal{E}, \mathcal{T}) \in \mathcal{O}(T^z)$ for some $z < 1$.

Theorems & Definitions (30)

  • Theorem 1
  • Definition 1
  • Theorem 2
  • Lemma 2
  • Lemma 2
  • Theorem 3
  • Lemma 3
  • Lemma 3
  • Lemma 3
  • Proposition 1: Hardness of Myersonian Regression [liu2020myersonian]
  • ...and 20 more