Contextual Online Bilateral Trade

Romain Cosson, Federico Fusco, Anupam Gupta, Stefano Leonardi, Renato Paes Leme, Matteo Russo

TL;DR

In the one-bit feedback model, where the learner only observes whether a trade happens, the tight two-bit regret regimes are still attainable, at the cost of allowing the learner to incur a small negative profit of order $O(d\log d)$, which is notably independent of the time horizon.

Abstract

We study repeated bilateral trade when the valuations of the sellers and the buyers are contextual. More precisely, the agents' valuations are given by the inner product of a context vector with two unknown $d$-dimensional vectors -- one for the buyers and one for the sellers. At each time step $t$, the learner receives a context and posts two prices, one for the seller and one for the buyer, and the trade happens if both agents accept their price. We study two objectives for this problem, gain from trade and profit, proving no-regret with respect to a surprisingly strong benchmark: the best omniscient dynamic strategy. In the natural scenario where the learner observes separately whether the agents accept their price -- the so-called two-bit feedback -- we design algorithms that achieve $O(d\log d)$ regret for gain from trade, and $O(d \log\log T + d\log d)$ regret for profit maximization. Both results are tight, up to the $\log(d)$ factor, and implement per-step budget balance, meaning that the learner never incurs negative profit. In the less informative one-bit feedback model, the learner only observes whether a trade happens or not. For this scenario, we show that the tight two-bit regret regimes are still attainable, at the cost of allowing the learner to possibly incur a small negative profit of order $O(d\log d)$, which is notably independent of the time horizon. As a final set of results, we investigate the combination of one-bit feedback and per-step budget balance. There, we design an algorithm for gain from trade that suffers regret independent of the time horizon, but exponential in the dimension $d$. For profit maximization, we maintain this exponential dependence on the dimension, which gets multiplied by a $\log T$ factor.
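
The interaction protocol described in the abstract can be sketched as a single simulated round. This is a minimal illustration, not the paper's algorithm: the names `play_round`, `two_bit_feedback`, and `one_bit_feedback` are hypothetical, ties are assumed to be broken in favor of acceptance, and gain from trade and profit are assumed to follow their usual definitions (buyer valuation minus seller valuation, and buyer price minus seller price, each counted only when the trade happens).

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


@dataclass
class RoundOutcome:
    seller_accepts: bool
    buyer_accepts: bool
    trade: bool
    gain_from_trade: float
    profit: float


def play_round(context, theta_seller, theta_buyer, price_seller, price_buyer):
    """Simulate one round; theta_* are the vectors unknown to the learner."""
    seller_value = dot(context, theta_seller)
    buyer_value = dot(context, theta_buyer)
    # Assumption: an agent accepts when the price is at least as good as its value.
    seller_accepts = price_seller >= seller_value
    buyer_accepts = price_buyer <= buyer_value
    trade = seller_accepts and buyer_accepts
    # Assumed standard definitions, both zero when no trade happens.
    gain = buyer_value - seller_value if trade else 0.0
    profit = price_buyer - price_seller if trade else 0.0
    return RoundOutcome(seller_accepts, buyer_accepts, trade, gain, profit)


def two_bit_feedback(outcome: RoundOutcome) -> Tuple[bool, bool]:
    """The learner sees each agent's decision separately."""
    return outcome.seller_accepts, outcome.buyer_accepts


def one_bit_feedback(outcome: RoundOutcome) -> bool:
    """The learner only sees whether the trade happened."""
    return outcome.trade


if __name__ == "__main__":
    outcome = play_round([1.0, 0.5], [0.25, 0.5], [0.5, 0.5], 0.5, 0.625)
    print(two_bit_feedback(outcome), one_bit_feedback(outcome), outcome.profit)
```

In this sketch, per-step budget balance corresponds to always posting `price_seller <= price_buyer`, so that `profit` can never be negative in any round.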

Paper Structure

This paper contains 44 sections, 20 theorems, 72 equations, 7 figures, 1 table, 4 algorithms.

Key Result

Lemma 4.0

If the algorithm posts a balanced price for the seller, then [...]. If instead it posts a balanced price for the buyer, then [...].

Figures (7)

  • Figure 1: Illustration of seller and buyer confidence regions and projections.
  • Figure 2: Visualization of the context-free dyadic search. $x$-axis corresponds to seller, $y$-axis to buyer.
  • Figure 3: Illustration of the quadratic search step: In phase $i$, we operate within a square $Q^i$ of side $\varepsilon_i$. (a) We fix the buyer's price to $b^i$, then test decreasing seller prices starting from $s^i$. The vertical dashed line indicates the first refused seller price. (b) We then fix the seller's price to $s^i$, then test increasing buyer prices starting from $b^i$. The horizontal dashed line indicates the first refused buyer's price. These two tests identify a new square $Q^{i+1}$ of side $\varepsilon_{i+1}$, where the algorithm is called recursively.
  • Figure 4: Visualization of the efficiency maximization algorithms.
  • Figure 5: Visualization of the profit-maximizing algorithm with one-bit feedback and per-round budget balance.
  • ...and 2 more figures
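
The search step described in the Figure 3 caption can be sketched roughly as follows. This is an illustration built only from the caption, under several assumptions that the caption does not state: the function name `refine_square` and the `trade` oracle are hypothetical, the current square is taken to place the seller's valuation in $[s^i - \varepsilon_i, s^i]$ and the buyer's valuation in $[b^i, b^i + \varepsilon_i]$, prices are tested on a grid of step $\varepsilon_{i+1}$, and only the single bit "did the trade happen" is observed.

```python
from typing import Callable, Tuple


def refine_square(
    s_hi: float,
    b_lo: float,
    eps: float,
    eps_next: float,
    trade: Callable[[float, float], bool],
) -> Tuple[float, float]:
    """Shrink a square of side eps to one of side eps_next.

    trade(seller_price, buyer_price) returns True iff both agents accept.
    Returns (new_s_hi, new_b_lo): the corner of the new, smaller square.
    """
    n = int(round(eps / eps_next))  # grid points per side (assumed integral)

    # (a) Fix the buyer's price at b_lo (assumed accepted by the buyer) and
    # test decreasing seller prices until the first refusal.
    new_s_hi = s_hi
    for k in range(1, n + 1):
        price = s_hi - k * eps_next
        if not trade(price, b_lo):
            break  # first refused seller price
        new_s_hi = price

    # (b) Fix the seller's price at s_hi (assumed accepted by the seller) and
    # test increasing buyer prices until the first refusal.
    new_b_lo = b_lo
    for k in range(1, n + 1):
        price = b_lo + k * eps_next
        if not trade(s_hi, price):
            break  # first refused buyer price
        new_b_lo = price

    return new_s_hi, new_b_lo


if __name__ == "__main__":
    seller_value, buyer_value = 0.3, 0.8  # hidden from the learner

    def oracle(ps: float, pb: float) -> bool:
        return ps >= seller_value and pb <= buyer_value

    print(refine_square(0.5, 0.5, 0.5, 0.125, oracle))
```

Under these assumptions, every posted pair has a seller price of at most `s_hi` and a buyer price of at least `b_lo`, so starting from `s_hi <= b_lo` no tested pair yields negative profit, and each phase uses at most `2 * n` queries.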

Theorems & Definitions (36)

  • Lemma 4.0: Balanced Prices
  • Theorem 4.1
  • proof
  • Proposition 4.1
  • Theorem 4.2
  • proof
  • Lemma 4.2
  • Lemma 4.3: Weak overlap case
  • proof
  • Lemma 4.4: Strong overlap case
  • ...and 26 more