Algorithmic Predation: Equilibrium Analysis in Dynamic Oligopolies with Smooth Market Sharing

Fabian Raoul Pieroth, Ole Petersen, Martin Bichler

TL;DR

This study combines deep reinforcement learning (DRL) with equilibrium verification to analyze predatory pricing in finite-horizon dynamic oligopolies that allow firm dropouts. It extends Selten's smooth-sharing framework by incorporating dropout dynamics and imperfect information, derives an analytical Nash equilibrium (NE) for the case without dropouts, and demonstrates that predatory equilibria arise when costs are asymmetric. The DRL-based approach converges consistently to approximate Nash equilibria and reveals robust predatory pricing, which can, in some cases, improve total welfare by efficiently restructuring the market. The findings offer nuanced implications for competition policy, suggesting that timing, cost structures, and exit dynamics can yield welfare outcomes that diverge from traditional antitrust intuitions.

Abstract

Predatory pricing -- where a firm strategically lowers prices to undermine competitors -- is a contentious topic in dynamic oligopoly theory, with scholars debating practical relevance and the existence of predatory equilibria. Although finite-horizon dynamic models have long been proposed to capture the strategic intertemporal incentives of oligopolists, the existence and form of equilibrium strategies in settings that allow for firm exit (drop-outs following loss-making periods) have remained an open question. We focus on the seminal dynamic oligopoly model by Selten (1965) that introduces the subgame perfect equilibrium and analyzes smooth market sharing. Equilibrium can be derived analytically in models that do not allow for dropouts, but not in models that can lead to predatory pricing. In this paper, we leverage recent advances in deep reinforcement learning to compute and verify equilibria in finite-horizon dynamic oligopoly games. Our experiments reveal two key findings: first, state-of-the-art deep reinforcement learning algorithms reliably converge to equilibrium in both perfect- and imperfect-information oligopoly models; second, when firms face asymmetric cost structures, the resulting equilibria exhibit predatory pricing behavior. These results demonstrate that predatory pricing can emerge as a rational equilibrium strategy across a broad variety of model settings. By providing equilibrium analysis of finite-horizon dynamic oligopoly models with drop-outs, our study answers a decade-old question and offers new insights for competition authorities and regulators.

Paper Structure

This paper contains 15 sections, 1 theorem, 6 equations, 3 figures, 1 table, 1 algorithm.

Key Result

Theorem 1

Consider a dynamic oligopoly model with $N$ firms, unit production costs $c_i$, initial demand $D_1^i$, and time horizon $T$. The model assumes no demand observation, i.e., $\Phi_i(s_t)=t$, and no dropouts. Then, any solution to the following system of equations constitutes a deterministic NE: where the constraints are $D_t^i\geq 0$ and $c_i \leq p_t^i < p_{\text{max}}$ for $1 \leq t \leq T$ and

Figures (3)

  • Figure 1: Strategy profile learned by PPO in the partially observable case with dropouts for the specific cost scenario $c_0=0.51$ and $c_1=c_2=0.8$. Recall that with partial observability, a deterministic strategy is fully characterized by $T$ prices. If an agent drops out in a round, the graph stops at that round.
  • Figure 2: The predatory incentives $PI_i(\pi)$ for agents $i \in \{1, 2, 3\}$ and learned strategy profiles $\pi$ over the different costs $c_0$, information structures, and algorithms. The bold line represents the mean, and the colored shaded area represents the standard deviation over five seeds. The bottom bar indicates the regime, determined by a majority vote over all algorithms, information settings, and random seeds.
  • Figure 3: The producer surplus, consumer surplus, and overall welfare ($\Delta W^{\pi}$) differences for a learned strategy profile $\pi$ and the analytical equilibrium strategies $\pi^*$ without dropout under different costs $c_0$, information structures, and algorithms. The bold line represents the mean and the shaded area the standard deviation over five seeds. The bottom bar indicates the regime, determined by a majority vote over all algorithms, information settings, and random seeds.
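The captions of Figures 2 and 3 describe two aggregation steps: a mean and standard deviation over five seeds, and a regime label chosen by majority vote over all runs. The following is a minimal sketch of how such an aggregation could look, not the authors' implementation. The function names, the numeric values, and the regime labels (`"predatory"`, `"no-dropout"`) are hypothetical placeholders, and the use of the sample standard deviation is an assumption, since the captions do not say which estimator is used.

```python
from collections import Counter
from statistics import mean, stdev


def summarize_seeds(values):
    """Return (mean, sample standard deviation) of one metric across seeds."""
    return mean(values), stdev(values)


def majority_regime(labels):
    """Return the most frequent regime label among all runs.

    Ties are resolved in favor of the label that appears first
    (an assumption; the captions do not specify tie handling).
    """
    return Counter(labels).most_common(1)[0][0]


# Hypothetical metric values for one cost level c_0 over five seeds.
metric_values = [0.10, 0.12, 0.08, 0.11, 0.09]
metric_mean, metric_std = summarize_seeds(metric_values)

# Hypothetical regime labels, one per (algorithm, information setting, seed) run.
run_labels = ["predatory", "predatory", "no-dropout", "predatory", "no-dropout"]
regime = majority_regime(run_labels)

print(f"mean={metric_mean:.3f}, std={metric_std:.3f}, regime={regime}")
```

In a plot of this kind, `metric_mean` would give the bold line, `metric_mean ± metric_std` the shaded band, and `regime` the color of the bottom bar at that cost level.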

Theorems & Definitions (1)

  • Theorem 1