Learning-Based Pricing and Matching for Two-Sided Queues

Zixian Yang, Lei Ying

TL;DR

The paper proposes a learning-based pricing algorithm that combines gradient-free stochastic projected gradient ascent with bisection search. The algorithm achieves sublinear regret together with a sublinear queue-length bound, and the analysis establishes a tradeoff between the two bounds.

Abstract

We consider a dynamic system with multiple types of customers and servers. Each type of waiting customer or server joins a separate queue, forming a bipartite graph with customer-side queues and server-side queues. The platform can match the servers and customers if their types are compatible. The matched pairs then leave the system. The platform will charge a customer a price according to their type when they arrive and will pay a server a price according to their type. The arrival rate of each queue is determined by the price according to some unknown demand or supply functions. Our goal is to design pricing and matching algorithms to maximize the profit of the platform with unknown demand and supply functions, while keeping queue lengths of both customers and servers below a predetermined threshold. This system can be used to model two-sided markets such as ride-sharing markets with passengers and drivers. The difficulties of the problem include simultaneous learning and decision making, and the tradeoff between maximizing profit and minimizing queue length. We use a longest-queue-first matching algorithm and propose a learning-based pricing algorithm, which combines gradient-free stochastic projected gradient ascent with bisection search. We prove that our proposed algorithm yields a sublinear regret of $\tilde{O}(T^{5/6})$ and an anytime queue-length bound of $\tilde{O}(T^{1/6})$, where $T$ is the time horizon. We further establish a tradeoff between the regret bound and the queue-length bound: $\tilde{O}(T^{1-\gamma})$ versus $\tilde{O}(T^{\gamma})$ for $\gamma \in (0, 1/6]$.
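The two ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm. It assumes the ascent runs over target arrival rates (as the Figure 5 caption suggests), that bisection is used to find a price producing a target rate under a monotonically decreasing demand function, and that the gradient is estimated with a standard two-point random-direction scheme. The names `bisect_price`, `zeroth_order_ascent`, `demand`, and `profit`, and all step sizes, are hypothetical.

```python
import numpy as np

def bisect_price(demand, target_rate, lo, hi, iters=40):
    """Find a price in [lo, hi] whose arrival rate is close to target_rate.

    Assumes `demand` is decreasing in price (an assumption of this sketch).
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if demand(mid) > target_rate:
            lo = mid  # too many arrivals at this price: search higher prices
        else:
            hi = mid  # too few arrivals at this price: search lower prices
    return 0.5 * (lo + hi)

def zeroth_order_ascent(profit, x0, lower, upper,
                        steps=200, delta=0.05, eta=0.1, seed=0):
    """Maximize a black-box `profit` over a box using only function values."""
    rng = np.random.default_rng(seed)
    n = len(x0)
    # Work on a box shrunk by delta so that perturbed points stay feasible.
    x = np.clip(np.asarray(x0, dtype=float), lower + delta, upper - delta)
    for _ in range(steps):
        u = rng.normal(size=n)
        u /= np.linalg.norm(u)  # random unit direction
        # Two-point gradient estimate from two evaluations of the objective.
        diff = profit(x + delta * u) - profit(x - delta * u)
        grad_est = (n / (2.0 * delta)) * diff * u
        # Ascent step followed by projection back onto the shrunk box.
        x = np.clip(x + eta * grad_est, lower + delta, upper - delta)
    return x
```

For instance, with the hypothetical linear demand `lambda p: 1.0 - p`, calling `bisect_price(demand, 0.3, 0.0, 1.0)` returns a price close to 0.7. In a learning setting both `demand` and `profit` would be replaced by noisy estimates from observed arrivals, which this sketch does not model.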

Paper Structure

This paper contains 33 sections, 16 theorems, 259 equations, 8 figures, 3 algorithms.

Key Result

Proposition 1

Let Assumption 1 and Assumption 2 hold. Consider policies such that the limits $\lim_{T\rightarrow \infty} \frac{1}{T} \sum_{t=1}^{T} \mathbb{E}[X_{i,j}(t)]$, $\lim_{T\rightarrow \infty} \frac{1}{T} \sum_{t=1}^{T} \mathbb{E}[ \lambda_i(t)]$, and $\lim_{T\rightarrow \infty} \frac{1}{T} \s

Figures (8)

  • Figure 1: The model, an example with 3 types of customers and 2 types of servers.
  • Figure 2: The timeline in each time slot.
  • Figure 3: Tradeoff between regret and queue length.
  • Figure 4: An example of longest-queue-first matching algorithm. At each time slot, for each queue, if there is a new arrival, it will be matched with one server (or customer) in the longest compatible queue on the other side. Then the matched pairs leave the system and we move on to check the next queue and repeat the same process.
  • Figure 5: Pricing algorithm, an example with two customer queues connected to one server queue. Let $x_1$ and $x_2$ denote the arrival rates of the two customer queues, respectively. The arrival rate of the server queue should be $x_1+x_2$ because of the balance constraint. Let $\boldsymbol{x}(k)\coloneqq (x_1(k), x_2(k))$, where $k$ denotes the iteration index.
  • ...and 3 more figures
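
The matching rule described in the Figure 4 caption can be sketched in Python as follows. This is a minimal illustration, not the paper's pseudocode. It assumes at most one arrival per queue per time slot, that customer queues are checked before server queues, that ties are broken by the lowest type index, and that an unmatched arrival waits in its own queue. The function name `longest_queue_first` and the `compat` mapping are hypothetical.

```python
def longest_queue_first(customer_q, server_q, compat,
                        customer_arrivals, server_arrivals):
    """Run one time slot of longest-queue-first matching.

    customer_q, server_q: lists of queue lengths, updated in place.
    compat[i]: set of server types compatible with customer type i.
    customer_arrivals, server_arrivals: 0/1 flags, one per queue.
    Returns the list of matched (customer_type, server_type) pairs.
    """
    matches = []
    # Customer side: match each new arrival with the longest compatible
    # non-empty server queue, or let it wait if none exists.
    for i, arrived in enumerate(customer_arrivals):
        if not arrived:
            continue
        candidates = [j for j in sorted(compat[i]) if server_q[j] > 0]
        if candidates:
            j = max(candidates, key=lambda s: server_q[s])
            server_q[j] -= 1  # the matched server leaves the system
            matches.append((i, j))
        else:
            customer_q[i] += 1  # no compatible server is waiting
    # Server side: the symmetric rule against customer queues.
    for j, arrived in enumerate(server_arrivals):
        if not arrived:
            continue
        candidates = [i for i in range(len(customer_q))
                      if j in compat[i] and customer_q[i] > 0]
        if candidates:
            i = max(candidates, key=lambda c: customer_q[c])
            customer_q[i] -= 1  # the matched customer leaves the system
            matches.append((i, j))
        else:
            server_q[j] += 1  # no compatible customer is waiting
    return matches
```

With three customer types and two server types, as in the Figure 1 example, a hypothetical compatibility structure would be `compat = {0: {0}, 1: {0, 1}, 2: {1}}`. A type-1 customer arriving when the server queues hold 2 and 3 servers is then matched with server type 1, the longer of its two compatible queues.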

Theorems & Definitions (16)

  • Proposition 1
  • Theorem 1
  • Corollary 1
  • Corollary 2
  • Corollary 3
  • Theorem 2
  • Corollary 4
  • Corollary 5
  • Lemma 1
  • Lemma 2
  • ...and 6 more