A Stronger Benchmark for Online Bilateral Trade: From Fixed Prices to Distributions

Anna Lunghi; Mattia Piccinato; Matteo Castiglioni; Alberto Marchesi

TL;DR

This work studies online bilateral trade under a Global Budget Balance (GBB) constraint with one-bit feedback in a stochastic environment. It proves that, when the joint distribution of valuations has bounded density, a learner can achieve $\tilde{O}(T^{3/4})$ regret against the best fixed GBB distribution over price pairs, matching the known lower bound and closing the gap with the weaker fixed-price benchmark. The authors introduce a three-phase algorithm: profit collection, a two-dimensional grid-based pure exploration that reuses samples efficiently, and an optimistic constrained-bandit optimization of the Gain From Trade (GFT). A key technical contribution is showing that bounded density allows grid discretization without prohibitive loss, and that exploring $K^2$ price pairs requires only $2K$ coupled estimations. The result shows that there is no separation between learning a one-dimensional Weak Budget Balance (WBB) price and learning the two-dimensional GBB distribution, with potential implications for mechanism design in repeated trade settings.

Abstract

We study online bilateral trade, where a learner facilitates repeated exchanges between a buyer and a seller to maximize the Gain From Trade (GFT), i.e., the social welfare. In doing so, the learner must guarantee not to subsidize the market. This constraint is usually imposed per round through Weak Budget Balance (WBB). However, Bernasconi et al. [2024] show that a Global Budget Balance (GBB) constraint on the profit, enforced over the entire time horizon, can improve the GFT by a multiplicative factor of two. While this might appear to be a marginal relaxation, it implies that all existing WBB-focused algorithms suffer linear regret when measured against the GBB optimum. In this work, we provide the first algorithm to achieve sublinear regret against the GBB benchmark in stochastic environments under one-bit feedback. In particular, we show that when the joint distribution of valuations has a bounded density, our algorithm achieves $\widetilde{\mathcal{O}}(T^{3/4})$ regret. Our result shows that there is no separation between the one-dimensional problem of learning the optimal WBB price and the two-dimensional problem of learning the optimal GBB distribution over pairs of prices.
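To make the difference between the per-round WBB constraint and the horizon-wide GBB constraint concrete, here is a minimal Python sketch of a single trading round. It assumes the protocol commonly used in this line of work, which this page does not spell out: the learner posts a price `p` to the seller and a price `q` to the buyer, a trade happens only if both accept, and the learner observes just the one-bit outcome. The names `RoundOutcome`, `play_round`, `is_wbb` and `is_gbb` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class RoundOutcome:
    traded: bool   # the single bit of feedback the learner observes
    gft: float     # gain from trade; depends on hidden valuations, so unobserved
    profit: float  # learner's profit; computable from the bit and the posted prices


def play_round(s: float, b: float, p: float, q: float) -> RoundOutcome:
    """Simulate one round (assumed protocol, not quoted from the paper).

    s, b: private valuations of the seller and the buyer.
    p, q: prices posted to the seller and to the buyer.
    """
    traded = (s <= p) and (q <= b)        # both parties must accept
    gft = (b - s) if traded else 0.0      # welfare created by the exchange
    profit = (q - p) if traded else 0.0   # buyer pays q, seller receives p
    return RoundOutcome(traded, gft, profit)


def is_wbb(prices) -> bool:
    """Weak Budget Balance: no subsidy in any single round (p <= q every time)."""
    return all(p <= q for p, q in prices)


def is_gbb(outcomes) -> bool:
    """Global Budget Balance: cumulative profit over the horizon is non-negative."""
    return sum(o.profit for o in outcomes) >= 0.0


# Round 1 collects profit; round 2 subsidizes a trade using that profit.
prices = [(0.25, 1.0), (0.75, 0.25)]
valuations = [(0.25, 1.0), (0.5, 0.75)]
outcomes = [play_round(s, b, p, q) for (s, b), (p, q) in zip(valuations, prices)]
```

In this example the second round posts `p > q`, so the sequence violates WBB, yet the profit collected in the first round covers the subsidy and the sequence satisfies GBB. This is the extra freedom that a GBB benchmark can exploit.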

Paper Structure (27 sections, 17 theorems, 85 equations, 1 figure, 3 algorithms)

Introduction
Our Results and Techniques
GGB Algorithm Vs. GBB Baseline
Additional Related Works
Preliminaries
Learning Protocol
Budget Balance
Feedback Model
Regret Against the Best GBB Fixed Distribution
Problem Formulation Recap
Why We Need Bounded Density
Construction of the Hard Instances
Main Theorem and Proof Plan
Under Bounded Density a Grid is Enough!
High-Level Construction of the Algorithm
...and 12 more sections

Key Result

Theorem 3.1

For any $T\in \mathbb{N}$ and any learning algorithm, there is an instance such that $\mathbb{E}[R_T] \ge \Omega(T).$
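The quantity $R_T$ is not defined on this page. One plausible reading, consistent with the section title "Regret Against the Best GBB Fixed Distribution", is sketched below. The notation ($\gamma$, $\mathrm{GFT}$, $\mathrm{Rev}$, $\mathrm{OPT}$) is an assumption made for illustration and may differ from the paper's.

```latex
% Assumed notation: s, b are the seller's and buyer's valuations,
% (p, q) are the prices posted to the seller and to the buyer.
\mathrm{GFT}(p,q) = \mathbb{E}_{(s,b)}\big[(b-s)\,\mathbb{1}\{s \le p,\ q \le b\}\big],
\qquad
\mathrm{Rev}(p,q) = \mathbb{E}_{(s,b)}\big[(q-p)\,\mathbb{1}\{s \le p,\ q \le b\}\big],

% Benchmark: the best fixed distribution over price pairs that is budget balanced in expectation.
\mathrm{OPT} = \sup_{\gamma \in \Delta([0,1]^2)} \mathbb{E}_{(p,q)\sim\gamma}\big[\mathrm{GFT}(p,q)\big]
\quad \text{s.t.} \quad \mathbb{E}_{(p,q)\sim\gamma}\big[\mathrm{Rev}(p,q)\big] \ge 0,

% Regret of a learner posting (p_t, q_t) at round t.
R_T = T \cdot \mathrm{OPT} - \sum_{t=1}^{T} \mathrm{GFT}(p_t, q_t).
```

Under this reading, Theorem 3.1 says that some instance forces every learner to incur regret linear in $T$, while the abstract states that $\widetilde{\mathcal{O}}(T^{3/4})$ regret is achievable once the joint distribution of valuations has a bounded density.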

Figures (1)

Figure 1: Top: Support of the probability distribution $\mathcal{D}_\epsilon$. Bottom: Expected profit (left) and GFT (right) under $\mathcal{D}_\epsilon$.

Theorems & Definitions (27)

Theorem 3.1
Theorem 4.1
Lemma 5.0
Lemma 5.0
proof : Proof Sketch
Lemma 7.0
proof : Proof Sketch
Lemma 7.1
Lemma 8.0
Lemma 8.0
...and 17 more
