Detecting Toxic Flow

Álvaro Cartea, Gerardo Duran-Martin, Leandro Sánchez-Betancourt

TL;DR

This work tackles the problem of predicting toxic FX trades at the granularity of individual trades within broker–client flow. It introduces PULSE, an online Bayesian neural-network training method that decomposes hidden-layer parameters into a fixed subspace and a last layer, enabling rapid updates and uncertainty quantification. Empirically, PULSE outperforms logistic regression, random forests, and a recursively updated MLE in predicting toxic trades and yields higher broker PnL when combined with an internalise/externalise strategy. The results also show that a universal model with client features generally surpasses per-client models and that the approach scales to real-time deployment with meaningful practical impact for toxicity management in FX markets.

Abstract

This paper develops a framework to predict toxic trades that a broker receives from her clients. Toxic trades are predicted with a novel online learning Bayesian method which we call the projection-based unification of last-layer and subspace estimation (PULSE). PULSE is a fast and statistically-efficient Bayesian procedure for online training of neural networks. We employ a proprietary dataset of foreign exchange transactions to test our methodology. Neural networks trained with PULSE outperform standard machine learning and statistical methods when predicting if a trade will be toxic; the benchmark methods are logistic regression, random forests, and a recursively-updated maximum-likelihood estimator. We devise a strategy for the broker who uses toxicity predictions to internalise or to externalise each trade received from her clients. Our methodology can be implemented in real-time because it takes less than one millisecond to update parameters and make a prediction. Compared with the benchmarks, online learning of a neural network with PULSE attains the highest PnL and avoids the most losses by externalising toxic trades.
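
The internalise/externalise strategy mentioned in the abstract can be illustrated with a minimal sketch. The abstract only states that toxicity predictions drive the routing decision; the `Trade` class, the `route_trade` function, the fixed `cutoff` parameter, and the tie-breaking rule below are hypothetical names and choices made for illustration, not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class Trade:
    trade_id: str
    toxicity_prob: float  # predicted probability that the trade is toxic (assumed model output)


def route_trade(trade: Trade, cutoff: float = 0.5) -> str:
    """Externalise when predicted toxicity reaches the cutoff, otherwise internalise.

    The cutoff value and the >= tie-break are assumptions, not taken from the paper.
    """
    if not 0.0 <= trade.toxicity_prob <= 1.0:
        raise ValueError("toxicity_prob must lie in [0, 1]")
    return "externalise" if trade.toxicity_prob >= cutoff else "internalise"


# Hypothetical predictions for three incoming trades.
trades = [Trade("a", 0.12), Trade("b", 0.81), Trade("c", 0.50)]
decisions = {t.trade_id: route_trade(t, cutoff=0.5) for t in trades}
```

In this sketch a lower cutoff externalises more flow, which avoids more toxic trades at the cost of internalising fewer benign ones; how the cutoff is actually chosen is not specified in the text above.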

Paper Structure

This paper contains 28 sections, 4 theorems, 66 equations, 22 figures, 3 tables, and 1 algorithm.

Key Result

Theorem 2

Suppose $\log p(y_{n} \,\vert\, {\bf z}, {\bf w}; {\bf x}_{n})$ is differentiable with respect to $({\bf z}, {\bf w})$ and the observations $\{y_{n}\}_{n=1}^N$ are conditionally independent over $({\bf z}, {\bf w})$. Write the mean of the target variable $y_{n}$ as a first-order approximation of the [...] where ${\boldsymbol{\mu}}_{n}$, ${\boldsymbol{\Gamma}}_{n}$ are the estimated mean and covariance [...]

Figures (22)

  • Figure 1: Relationship of PULSE to other models.
  • Figure 2: A client's sell trade that becomes toxic for the broker a few seconds after the trade is filled. The $x$-axis is time and the $y$-axis is in units of USD.
  • Figure 3: Profitability in dollars per million euros traded after a trade is executed. Panels correspond to two different clients. Blue line is the median trajectory and grey region is the 90% trajectory region. The $x$-axis is time and the $y$-axis is the profitability from the point of view of the client.
  • Figure 4: PULSE architecture for an MLP. The MLP is parameterised by ${\boldsymbol\theta} = ({\boldsymbol\psi}, {\bf w})$, where ${\boldsymbol\psi}$ are the parameters in the hidden layers and ${\bf w}$ are the parameters in the last layer.
  • Figure 5: Warmup and deployment stages. We use all data available from $t_0$ to $t_\text{warmup}$ to estimate $\bf A$ and $\bf b$. At $t_\text{warmup}$, we initialise the variational approximations $\phi_0({\bf w})$ and $\varphi_0({\bf z})$. Finally, for $t > t_\text{warmup}$, we estimate ${\bf w}_t$ and ${\bf z}_t$.
  • ...and 17 more figures
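
The captions of Figures 4 and 5 suggest that the hidden-layer parameters ${\boldsymbol\psi}$ are tied to a low-dimensional vector ${\bf z}$ through quantities ${\bf A}$ and ${\bf b}$ that are fixed after warmup. A minimal NumPy sketch of that idea follows. It assumes the affine map ${\boldsymbol\psi} = {\bf A}{\bf z} + {\bf b}$, which is a reading of the captions rather than something they state explicitly. The layer sizes, the `tanh` activation, and the random stand-ins for ${\bf A}$ and ${\bf b}$ are hypothetical, and no Bayesian update is shown.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes for a single-hidden-layer MLP.
n_features, n_hidden, subspace_dim = 4, 8, 3
n_psi = n_features * n_hidden + n_hidden  # hidden-layer weights plus biases

# Stand-ins for the quantities estimated during warmup (random here, for illustration only).
A = rng.normal(size=(n_psi, subspace_dim))  # fixed projection matrix
b = rng.normal(size=n_psi)                  # fixed offset


def hidden_params(z):
    """Map subspace coordinates z to full hidden-layer parameters psi = A z + b (assumed form)."""
    psi = A @ z + b
    W = psi[: n_features * n_hidden].reshape(n_hidden, n_features)
    c = psi[n_features * n_hidden:]
    return W, c


def predict(x, z, w):
    """Toxicity probability from features x, subspace coordinates z, and last-layer weights w."""
    W, c = hidden_params(z)
    h = np.tanh(W @ x + c)               # hidden activations (activation choice is an assumption)
    return 1.0 / (1.0 + np.exp(-(w @ h)))  # sigmoid of the last-layer output


z = np.zeros(subspace_dim)     # subspace coordinates, updated online after warmup
w = rng.normal(size=n_hidden)  # last-layer weights, updated online after warmup
x = rng.normal(size=n_features)
p = predict(x, z, w)
n_online = z.size + w.size     # parameters tracked online in this sketch
```

With these sizes, only `n_online` values (11) are tracked during deployment instead of the `n_psi + n_hidden` (48) values of the full network, which is one plausible reason such a decomposition allows fast updates.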

Theorems & Definitions (12)

  • Definition 1: Toxic trade
  • Theorem 2: PULSE
  • Definition 3: $\mathfrak{p}$-predicted toxic trade
  • Definition 4
  • Definition 5
  • Proposition 1
  • proof
  • Corollary 1
  • proof
  • Proposition 2
  • ...and 2 more