Algorithmic collusion in a two-sided market: A rideshare example

Pravesh Koirala; Forrest Laine

TL;DR

The results indicate that PPO can converge to either a competitive or a collusive equilibrium depending on the underlying market characteristics, even when the hyperparameters are held constant.

Abstract

With dynamic pricing on the rise, firms are using sophisticated algorithms for price determination. These algorithms are often non-interpretable, and there has been recent interest in their seemingly emergent ability to collude tacitly with each other without any prior communication. Most previous work investigates algorithmic collusion using simple reinforcement learning (RL) algorithms operating in a basic market model. Instead, we explore the collusive tendencies of Proximal Policy Optimization (PPO), a state-of-the-art RL algorithm for continuous state and action spaces, in a complex two-sided hierarchical market model of rideshare. For this purpose, we extend a mathematical program network (MPN) based rideshare model to a temporal multi origin-destination setting and use PPO to solve a repeated duopoly game. Our results indicate that PPO can converge to either a competitive or a collusive equilibrium depending on the underlying market characteristics, even when the hyperparameters are held constant.

Paper Structure (13 sections, 12 equations, 6 figures)

Introduction
Literature Review
Model
Extension to temporal multi origin-destination (OD) graphs
Temporal multi OD mathematical program network
Optimum Response and Optimization Objective
Responsive supply market
Lagging supply market
Optimization Objective
Proximal Policy Optimization (PPO) for Multi-Agent Reinforcement Learning
Experiment
Results and Discussion
Conclusion

Figures (6)

Figure 1: MPN-based model for the ridesharing duopoly problem proposed by koirala2023decreasing. Platforms U and L simultaneously decide wages for the drivers (D) and rates for the passengers (P). The drivers then decide which platform to work for, after which the passengers decide which platform to use.
Figure 2: A graph with multiple origin-destination (OD) pairs. Nodes in this graph represent locations, whereas edges indicate routes. Edge weights represent the proportions of passengers wanting to move to the corresponding nodes. The distances of these routes are given by the $\mathbb{D}$ matrix.
Figure 3: eMPN for temporal multi-OD graphs. This is an extension of the MPN depicted in Figure 1. There are $N$ nodes for drivers and $N^2-N$ nodes for passengers.
Figure 4: Exponential Moving Average (EMA) smoothed instantaneous profits at the end of each episode in the responsive and lagging supply cases. The profits converge as the models learn the market response better. In the responsive supply case, competition drives down the profits for both platforms, but in the lagging supply case, the algorithms learn to extract consistent profits by colluding.
Figure 5: EMA smoothed rates and commissions for a responsive market. Competition on the supply side drives the commission towards the rate, reducing platform profits.
...and 1 more figures
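
The graph structure described in the captions of Figures 2 and 3 can be illustrated with a small sketch. It is only an illustration under stated assumptions: the three-location example, all numeric values, and the helper names (`od_pairs`, `validate_proportions`, `expected_trip_distance`) are hypothetical and do not come from the paper. It also assumes that the passenger proportions leaving each origin sum to 1 and that there are no self-trips, which is one natural reading of the Figure 2 caption.

```python
from itertools import permutations

# Hypothetical 3-location example; every number here is illustrative only.
N = 3

# proportions[i][j]: assumed share of passengers at node i who want to go to node j.
proportions = [
    [0.0, 0.7, 0.3],
    [0.5, 0.0, 0.5],
    [0.2, 0.8, 0.0],
]

# distances[i][j]: length of the route from i to j (the role of the D matrix).
distances = [
    [0.0, 4.0, 6.0],
    [4.0, 0.0, 3.0],
    [6.0, 3.0, 0.0],
]


def od_pairs(n):
    """Return all ordered origin-destination pairs (i, j) with i != j."""
    return list(permutations(range(n), 2))


def validate_proportions(p, tol=1e-9):
    """Check the assumed properties: no self-trips, non-negative shares, rows sum to 1."""
    for i, row in enumerate(p):
        if row[i] != 0.0 or any(x < 0 for x in row):
            return False
        if abs(sum(row) - 1.0) > tol:
            return False
    return True


def expected_trip_distance(p, d, origin):
    """Proportion-weighted average route length for passengers starting at `origin`."""
    return sum(p[origin][j] * d[origin][j] for j in range(len(p)) if j != origin)


pairs = od_pairs(N)
# One passenger node per ordered OD pair gives N^2 - N nodes, as in Figure 3.
print(len(pairs), N**2 - N)
print(validate_proportions(proportions))
print(expected_trip_distance(proportions, distances, 0))
```

Enumerating ordered pairs makes the $N^2-N$ count in Figure 3 explicit: each of the $N$ origins has $N-1$ possible destinations, while the driver side needs only one node per location.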
