Nonparametric Contextual Online Bilateral Trade

Emanuele Coccia, Martino Bernasconi, Andrea Celli

TL;DR

The paper designs an algorithm that leverages contextual information through a hierarchical tree construction and guarantees regret $\widetilde{O}(T^{(d-1)/d})$. The algorithm operates under two stringent features of the setting: one-bit feedback, where the learner only observes whether a trade occurred or not, and strong budget balance, where the learner cannot subsidize or profit from the market participants.

Abstract

We study the problem of contextual online bilateral trade. At each round, the learner faces a seller-buyer pair and must propose a trade price without observing their private valuations for the item being sold. The goal of the learner is to post prices to facilitate trades between the two parties. Before posting a price, the learner observes a $d$-dimensional context vector that influences the agents' valuations. Prior work in the contextual setting has focused on linear models. In this work, we tackle a general nonparametric setting in which the buyer's and seller's valuations behave according to arbitrary Lipschitz functions of the context. We design an algorithm that leverages contextual information through a hierarchical tree construction and guarantees regret $\widetilde{O}(T^{(d-1)/d})$. Remarkably, our algorithm operates under two stringent features of the setting: (1) one-bit feedback, where the learner only observes whether a trade occurred or not, and (2) strong budget balance, where the learner cannot subsidize or profit from the market participants. We further provide a matching lower bound in the full-feedback setting, demonstrating the tightness of our regret bound.
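The interaction described in the abstract can be sketched as a single round of posted-price trade. This is a minimal illustration, not the paper's algorithm: it assumes noiseless valuations given exactly by the Lipschitz functions, the standard bilateral-trade convention that a trade happens when the price lies between the seller's and the buyer's valuation, and gain from trade equal to their difference. The names `play_round`, `RoundOutcome`, `f_s_example`, and `f_b_example` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoundOutcome:
    traded: bool            # the single bit the learner observes
    gain_from_trade: float  # hidden from the learner; used only to measure regret


def play_round(seller_value: float, buyer_value: float, price: float) -> RoundOutcome:
    """Simulate one round with a single posted price.

    Strong budget balance: the same price is paid by the buyer and received
    by the seller, so the learner neither subsidizes nor profits.
    Assumed convention: the trade occurs iff seller_value <= price <= buyer_value.
    """
    traded = seller_value <= price <= buyer_value
    gain = buyer_value - seller_value if traded else 0.0
    return RoundOutcome(traded=traded, gain_from_trade=gain)


# Hypothetical Lipschitz valuation functions of a scalar context x in [0, 1].
def f_s_example(x: float) -> float:
    return 0.2 + 0.3 * x


def f_b_example(x: float) -> float:
    return 0.6 - 0.2 * x


x = 0.5
accepted = play_round(f_s_example(x), f_b_example(x), price=0.4)
rejected = play_round(f_s_example(x), f_b_example(x), price=0.6)
```

In this toy instance the price 0.4 is accepted by both parties, while 0.6 exceeds the buyer's valuation and no trade occurs. In either case the learner would see only the `traded` bit.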

Paper Structure

This paper contains 31 sections, 17 theorems, 38 equations, 3 figures, 5 algorithms.

Key Result

Lemma 1

Consider a node $N_{\ell,z}$ for which the Reduce procedure terminated. Then, for any $x\in \textup{Region}(N_{\ell,z})$, it holds that $f_b(x)-f_s(x)\le 6L2^{-\ell}$.
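To get a feel for how quickly the bound in Lemma 1 shrinks with depth, the following sketch evaluates $6L2^{-\ell}$ and finds the shallowest level at which it falls below a target gap. It only does arithmetic on the stated bound and does not implement the Reduce procedure; the function names and the tolerance parameter `eps` are hypothetical.

```python
def lemma1_bound(lipschitz: float, level: int) -> float:
    """Upper bound 6 * L * 2^(-level) on f_b(x) - f_s(x) from Lemma 1."""
    return 6.0 * lipschitz * 2.0 ** (-level)


def min_level_for_gap(lipschitz: float, eps: float) -> int:
    """Smallest level whose Lemma 1 bound is at most eps (eps must be positive)."""
    if eps <= 0:
        raise ValueError("eps must be positive")
    level = 0
    # The bound halves at each level, so this loop always terminates.
    while lemma1_bound(lipschitz, level) > eps:
        level += 1
    return level


# Bounds for the first few levels with L = 1: 6.0, 3.0, 1.5, 0.75
bounds = [lemma1_bound(1.0, level) for level in range(4)]
```

Because the bound halves with every additional level, each extra level of refinement halves the largest gain from trade that can remain in a node where Reduce terminated.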

Figures (3)

  • Figure 1: The $x$-axis (resp., $y$-axis) represents the seller's (resp., buyer's) valuations. The blue (resp., yellow) region denotes valuations for which $p_L$ (resp., $p_U$) is rejected. Lemma 1 states that if there are valuations in both the yellow and the blue regions, then there cannot be any with $\textnormal{GFT} \ge 6h=6L2^{-\ell}$. Let the green and red regions be the image of $\textup{Region}(N_{\ell,z})$ under $(f_s,f_b)$. Then Lemma 1 forbids the red region, but allows the green one.
  • Figure 2: Average regret (together with 95% confidence intervals) in the quadratic setting.
  • Figure 3: Comparison between average regret and theoretical bounds.

Theorems & Definitions (30)

  • Definition 1
  • Lemma 1
  • Lemma 2: Regret due to
  • Lemma 3: Regret due to
  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Lemma 4: Yao's weak minimax principle
  • Theorem 4: McShane's extension (McShane, 1934)
  • Lemma 5
  • ...and 20 more