Online Optimization Algorithms in Repeated Price Competition: Equilibrium Learning and Algorithmic Collusion

Martin Bichler, Julius Durmann, Matthias Oberlechner

TL;DR

The paper investigates whether online pricing algorithms used in repeated Bertrand competition converge to competitive Nash outcomes or enable tacit collusion. It establishes that mean-based bandit algorithms converge to the correlated rationalizable set, which coincides with the set of Nash equilibria in important Bertrand settings with all-or-nothing or linear demand, yielding last-iterate convergence to NE under certain conditions. Complementary experiments show that algorithmic collusion is rare in practice: it occurs mainly when a small number of firms all run the same UCB or Q-learning algorithm, and it weakens as the number of competitors increases or when heterogeneous algorithms interact. The results imply that the risk of widespread algorithm-driven collusion may be overstated in realistic pricing environments, and they give regulators and managers nuanced guidance on monitoring and tool deployment.
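To make the "competitive Nash outcome" concrete, the following is a minimal sketch, not taken from the paper, that enumerates the pure Nash equilibria of a discretized Bertrand duopoly with all-or-nothing demand. The price grid (integers 1 to 10), zero marginal cost, unit demand, and the equal split on ties are all illustrative assumptions, as are the names `profit` and `pure_nash_equilibria`.

```python
from fractions import Fraction
from itertools import product


def profit(p, q, cost=0):
    """Profit of a seller pricing at p against a rival pricing at q.

    Assumed all-or-nothing demand: the cheaper seller serves one unit,
    equal prices split the unit, and the more expensive seller sells nothing.
    """
    if p < q:
        return Fraction(p - cost)
    if p == q:
        return Fraction(p - cost, 2)
    return Fraction(0)


def pure_nash_equilibria(prices, cost=0):
    """Return all price pairs where neither seller gains by deviating."""
    equilibria = []
    for p, q in product(prices, repeat=2):
        best_against_q = max(profit(a, q, cost) for a in prices)
        best_against_p = max(profit(a, p, cost) for a in prices)
        if profit(p, q, cost) == best_against_q and profit(q, p, cost) == best_against_p:
            equilibria.append((p, q))
    return equilibria


PRICES = range(1, 11)  # hypothetical price grid
NASH = pure_nash_equilibria(PRICES)
print(NASH)  # [(1, 1), (2, 2)]
```

On this assumed grid the only pure equilibria sit at the bottom of the price range, while the jointly most profitable pair (10, 10) is not an equilibrium because either seller gains by undercutting to 9. Prices that persist well above the equilibrium pairs are what the text calls supra-competitive.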

Abstract

This paper examines whether widely used online learning algorithms in pricing can independently reach competitive outcomes or instead foster tacit collusion. This issue has drawn considerable attention from competition regulators as algorithmic pricing becomes more common in digital markets. Understanding when such algorithms lead to equilibrium prices or to supra-competitive prices is critical for buyers, sellers, and policymakers. We study the behavior of multi-armed bandit algorithms in repeated price competition. These algorithms only observe profits from the chosen prices, making them realistic models of automated pricing. Our formal analysis shows that an important class of online learning algorithms, called mean-based algorithms, reliably converges to Nash equilibrium in Bertrand competition. This finding is notable because, generally, online learning algorithms do not guarantee convergence. We also run extensive numerical experiments with different bandit algorithms, confirming that most widely used algorithms, including those not mean-based, converge to equilibrium. We observe supra-competitive prices only in specific cases where all sellers implement the same symmetric version of certain algorithms, such as UCB or Q-learning, and this effect diminishes as the number of competitors increases. Our results highlight that the risk of algorithmic collusion in competitive markets is often overstated. For most practical implementations of bandit algorithms, sellers' prices converge to competitive levels. Only under very specific and symmetric setups do prices remain above competitive benchmarks, and this effect diminishes with more competitors. These insights support regulators concerned with consumer welfare and managers considering algorithmic pricing tools. They suggest that while vigilance is warranted, fears of widespread algorithm-driven collusion may be exaggerated.
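The bandit-feedback setting described above, in which each seller observes only the profit of the price it actually chose, can be sketched as follows. This is an illustrative simulation under stated assumptions, not the paper's experimental setup: Exp3 is used as a familiar bandit algorithm, and the price grid, zero cost, all-or-nothing demand, exploration rate `gamma`, and the names `Exp3` and `simulate` are all hypothetical choices.

```python
import math
import random


def profit(p, q):
    """Assumed all-or-nothing demand with zero cost and ties split equally."""
    if p < q:
        return float(p)
    if p == q:
        return p / 2
    return 0.0


class Exp3:
    """Exp3 with log-weights for numerical stability; rewards must lie in [0, 1]."""

    def __init__(self, n_arms, gamma, rng):
        self.n_arms = n_arms
        self.gamma = gamma
        self.rng = rng
        self.log_weights = [0.0] * n_arms

    def probabilities(self):
        top = max(self.log_weights)
        weights = [math.exp(lw - top) for lw in self.log_weights]
        total = sum(weights)
        return [(1 - self.gamma) * w / total + self.gamma / self.n_arms for w in weights]

    def choose(self):
        probs = self.probabilities()
        arm = self.rng.choices(range(self.n_arms), weights=probs)[0]
        return arm, probs[arm]

    def update(self, arm, prob, reward):
        # Only the chosen arm is updated, using an importance-weighted estimate.
        self.log_weights[arm] += self.gamma * (reward / prob) / self.n_arms


def simulate(n_rounds=2000, prices=tuple(range(1, 11)), gamma=0.1, seed=0):
    """Run two Exp3 sellers against each other and return the price history."""
    rng = random.Random(seed)
    sellers = [Exp3(len(prices), gamma, rng) for _ in range(2)]
    scale = max(prices)  # normalizes profits into [0, 1]
    history = []
    for _ in range(n_rounds):
        picks = [seller.choose() for seller in sellers]
        p, q = (prices[arm] for arm, _ in picks)
        rewards = [profit(p, q) / scale, profit(q, p) / scale]
        for seller, (arm, prob), reward in zip(sellers, picks, rewards):
            seller.update(arm, prob, reward)
        history.append((p, q))
    return history


if __name__ == "__main__":
    print(simulate()[-5:])  # last five price pairs of one seeded run
```

Each seller's `update` receives only its own realized profit, never the rival's price or the payoff of prices it did not choose, which is the informational restriction the abstract refers to. A single seeded run like this says nothing on its own about convergence; it only shows the interaction loop.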

Paper Structure

This paper contains 51 sections, 22 theorems, 69 equations, 15 figures, 2 tables, 2 algorithms.

Key Result

Proposition 1

Given a finite normal-form game, any action profile that is in the support of a correlated equilibrium is also in the correlated rationalizable set.

Figures (15)

  • Figure 1: Coarse Correlated Equilibria for Different Bertrand Competitions
  • Figure 2: Relation Between Solution Concepts
  • Figure 3: Evolution of Prices During Training for Mean-based Algorithms
  • Figure 4: Collusion in Duopolies
  • Figure 5: Evolution of Prices During Training
  • ...and 10 more figures

Theorems & Definitions (58)

  • Definition 1: No-regret Algorithm
  • Definition 2: (Finite) Normal-form Game
  • Definition 3: Nash Equilibrium (NE)
  • Definition 4: Potential Game
  • Definition 5: Supermodular Game
  • Definition 6: Correlated Equilibrium
  • Definition 7: Coarse Correlated Equilibrium
  • Example 1: Coarse Correlated Equilibria in Bertrand Competitions
  • Definition 8: Dominated Actions
  • Definition 9: Correlated Rationalizable Set (CR)
  • ...and 48 more