Algorithmic collusion in a two-sided market: A rideshare example

Pravesh Koirala, Forrest Laine

TL;DR

The results indicate that PPO can converge to either a competitive or a collusive equilibrium depending on the underlying market characteristics, even when the hyperparameters are held constant.

Abstract

With dynamic pricing on the rise, firms are using sophisticated algorithms for price determination. These algorithms are often non-interpretable, and there has been recent interest in their seemingly emergent ability to tacitly collude with each other without any prior communication. Most previous work investigates algorithmic collusion with simple reinforcement learning (RL) algorithms operating on a basic market model. Instead, we explore the collusive tendencies of Proximal Policy Optimization (PPO), a state-of-the-art RL algorithm for continuous state and action spaces, on a complex double-sided hierarchical market model of rideshare. For this purpose, we extend a mathematical program network (MPN) based rideshare model to a temporal multi origin-destination setting and use PPO to solve a repeated duopoly game. Our results indicate that PPO can converge to either a competitive or a collusive equilibrium depending on the underlying market characteristics, even when the hyperparameters are held constant.

Paper Structure (13 sections, 12 equations, 6 figures)

Figures (6)

  • Figure 1: MPN based model for the ridesharing duopoly problem proposed in [koirala2023decreasing]. Platforms U and L decide wages and rates simultaneously for the drivers (D) and the passengers (P). The drivers then decide which platform to work for, followed by the passengers deciding which platform to use.
  • Figure 2: A graph with multiple origin-destination (OD) pairs. Nodes in this graph represent locations, whereas the edges indicate routes. Edge weights represent the proportions of passengers wanting to move to the corresponding nodes. The distances of these routes are given by the matrix $\mathbb{D}$.
  • Figure 3: eMPN for temporal multi-OD graphs. This is an extension of the MPN depicted in Figure 1. There are $N$ nodes for drivers and $N^2-N$ nodes for passengers.
  • Figure 4: Exponential Moving Average (EMA) smoothed instantaneous profits at the end of each episode in both cases. The profits converge as the models better learn the market response. In the responsive supply case, competition drives down the profits for both platforms, but in the lagging supply case, the algorithms learn to extract consistent profits by colluding.
  • Figure 5: EMA smoothed rates and commissions for a responsive market. The competition on the supply side drives the commission towards the rate, reducing platform profits.
  • ...and 1 more figures
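
The captions for Figures 4 and 5 refer to EMA smoothed curves. As a rough illustration of what such smoothing does, the sketch below applies a standard exponential moving average to a sequence of per-episode values. The function name `ema_smooth` and the smoothing factor `alpha` are assumptions made for this example; the source does not state which smoothing factor or implementation was used.

```python
def ema_smooth(values, alpha=0.1):
    """Return the exponential moving average of a sequence.

    alpha is a hypothetical smoothing factor in (0, 1]; larger values
    track the raw series more closely. The first output equals the
    first input, so no artificial start-up bias is introduced.
    """
    smoothed = []
    prev = None
    for v in values:
        # Blend the new observation with the running average.
        prev = v if prev is None else alpha * v + (1 - alpha) * prev
        smoothed.append(prev)
    return smoothed


# Example: smooth a short, made-up series of per-episode profits.
profits = [0.0, 10.0, 10.0, 10.0]
print(ema_smooth(profits, alpha=0.5))
```

With a small `alpha`, episode-to-episode noise is damped and the longer-run trend, such as profits settling at a competitive or a collusive level, becomes easier to read off a plot.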