A Network Simulation of OTC Markets with Multiple Agents

James T. Wilkinson; Jacob Kelter; John Chen; Uri Wilensky

TL;DR

The paper develops an open-source agent-based model of an OTC financial market in which market makers are the sole intermediaries and agent visibility is restricted by a network topology. Value investors use static price targets, while trend investors use a convolutional neural network paired with deep Q-learning to exploit price history, so that interactions play out dynamically on a constrained network. The model reproduces key stylized facts, including fat-tailed returns with power-law tails ($P(|r|) \sim |r|^{-\alpha}$, kurtosis $>3$) and volatility clustering, and demonstrates how market structure, especially network sparsity, drives fragmentation and arbitrage opportunities between market makers. The framework shows how network topology can shape price action and market efficiency, and it offers a flexible, open platform for studying OTC microstructure and multi-agent reinforcement learning (MARL) trading behaviors, with potential extensions to continuous action spaces and learned market-maker strategies.

Abstract

We present a novel agent-based approach to simulating an over-the-counter (OTC) financial market in which trades are intermediated solely by market makers and agent visibility is constrained by a network topology. Dynamics, such as changes in price, result from agent-level interactions, all of which occur via market maker agents acting as liquidity providers. Two additional agent types are considered: trend investors use a deep convolutional neural network paired with a deep Q-learning framework to inform trading decisions by analysing price history, and value investors use a static price target to determine their trade directions and sizes. We demonstrate that our novel inclusion of a network topology with market makers facilitates explorations into various market structures. First, we present the model and an overview of its mechanics. Second, we validate our findings by comparison with real-world markets: we demonstrate a fat-tailed distribution of price changes, auto-correlated volatility, a skew negatively correlated with market maker positioning, predictable price-history patterns and more. Finally, we demonstrate that our network-based model can lend insights into the effect of market structure on price action. For example, we show that markets with sparsely connected intermediaries can have a critical point of fragmentation, beyond which the market forms distinct clusters and arbitrage rapidly becomes possible between the prices of different market makers. We close with a discussion of directions for future work.
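Two of the validation checks named in the abstract, fat tails and auto-correlated volatility, can be illustrated with a short calculation. The following is a minimal sketch rather than the authors' validation code: it assumes log returns as the measure of price change and uses the autocorrelation of absolute returns as a proxy for volatility clustering. The function names (`excess_kurtosis`, `autocorrelation`, `stylized_facts`) are hypothetical and do not come from the paper.

```python
import numpy as np


def excess_kurtosis(returns):
    """Sample kurtosis minus 3; positive values indicate tails fatter than a normal."""
    r = np.asarray(returns, dtype=float)
    centered = r - r.mean()
    variance = np.mean(centered ** 2)
    return float(np.mean(centered ** 4) / variance ** 2 - 3.0)


def autocorrelation(series, lag=1):
    """Sample autocorrelation of a series at the given positive lag."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    return float(np.dot(x[:-lag], x[lag:]) / np.dot(x, x))


def stylized_facts(prices, lag=1):
    """Summarise fat tails and volatility clustering for a price series.

    Assumption: price changes are measured as log returns, and volatility
    clustering is proxied by the autocorrelation of absolute returns.
    """
    prices = np.asarray(prices, dtype=float)
    log_returns = np.diff(np.log(prices))
    return {
        "excess_kurtosis": excess_kurtosis(log_returns),
        "abs_return_autocorr": autocorrelation(np.abs(log_returns), lag),
    }
```

Applied to a simulated price series, an excess kurtosis well above zero together with a positive autocorrelation of absolute returns would be consistent with the stylized facts described above; a Gaussian random walk should give values near zero for both.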

Paper Structure (12 sections, 8 equations, 17 figures, 1 algorithm)

Introduction
The Model
Network formation
Market Makers
Value investors
Reinforcement Learning Agent Mechanics
Target generation of value investors
User-defined parameters
Validation of the model
Reinforcement learning analysis
Investigations into market structure
Discussion

Figures (17)

Figure 1: An illustration of our network generation. Every possible edge between an agent and a market maker is considered. Each possible edge is formed with probability $p$, which is a user-defined parameter. Market makers are also interconnected by this same network. Every edge in the full network must have at least one market maker as an endpoint. This aligns with our goal of having all trades performed against an intermediary market maker.
Figure 2: Illustration of our convolutional Q-network in the decision-making process. The dimensions of the convolutional layers are presented in the format (input channels, output channels, kernel size, stride). All convolutions are one-dimensional. When prompted to act, the trend investors either choose to act randomly (with a time-decaying probability $\epsilon$), or to use their Q-network to choose the action that is estimated to maximise a target value $Q$ as defined by Bellman's equation (\ref{eq:bellman}). This epsilon-greedy approach promotes a healthy mixture of exploration and exploitation during the agent's training.
Figure 3: Loss curves demonstrating the convergence of the trend investors' Q-networks. The logarithmic y-axis shows the mean-squared error between the model's predicted profit from its trade and the profit that was realised.
Figure 4: Trend investor profit tracked over a large number of model iterations. Once trained, trend investors employ profitable and economically viable strategies. The x-axis counts model iterations elapsed since the point at which all trend investors' exploration parameters $\epsilon$ reached their terminal value of 0.05. As such, negative values along this axis represent time steps where the trend investors are mostly untrained and their behaviour is mostly exploratory.
Figure 5: Price changes, measured every 50 iterations of the model once trend investor training has completed, illustrated with a fitted normal distribution. Shown both on a linear scale (upper) and a logarithmic scale (lower).
...and 12 more figures
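
The network generation described in the Figure 1 caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it reads the caption as saying that candidate edges exist only between pairs that include at least one market maker, and that each candidate is kept independently with probability $p$. The function name `generate_otc_network`, the node labels and the `seed` argument are hypothetical.

```python
import itertools
import random


def generate_otc_network(n_market_makers, n_investors, p, seed=None):
    """Sample an OTC network in which every edge touches a market maker.

    Assumption: each candidate edge is kept independently with probability p.
    Investors are never linked directly to one another.
    """
    rng = random.Random(seed)
    market_makers = [f"mm{i}" for i in range(n_market_makers)]
    investors = [f"inv{i}" for i in range(n_investors)]
    edges = set()

    # Market makers are interconnected by the same probabilistic rule.
    for a, b in itertools.combinations(market_makers, 2):
        if rng.random() < p:
            edges.add((a, b))

    # Each investor may connect to each market maker.
    for inv in investors:
        for mm in market_makers:
            if rng.random() < p:
                edges.add((inv, mm))

    return market_makers, investors, edges
```

With `p = 1` the sketch yields the fully connected case (every investor sees every market maker), while small values of `p` give the sparse networks in which the abstract reports fragmentation between market makers.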
