Temporal distribution of clusters of investors and their application in prediction with expert advice

Wojciech Wisniewski, Yuri Kalnishkan, David Lindsay, Siân Lindsay

TL;DR

This work analyzes how clusters of FX traders evolve over time and demonstrates that their distributions adhere to Ewens' Sampling Distribution. It shows that Statistically Validated Networks, coupled with Infomap, reveal dynamic cluster structures, and that online prediction via the Aggregating Algorithm benefits from incorporating these clusters, though performance degrades when there are many similar experts. To address this, the authors propose cluster-based AA variants (CAA and ECAA) and compare SVN-Infomap against hierarchical clustering; hierarchical clustering generally delivers stronger risk-adjusted returns. Overall, the paper provides a principled link between temporal clustering in trader behavior and improved online portfolio prediction, offering scalable methods to manage portfolio risk in live trading data.

Abstract

Financial organisations such as brokers face a significant challenge in servicing the investment needs of thousands of their traders worldwide. The task is compounded because individual traders have their own risk appetites and investment goals. Traders may look to capture short-term trends in the market which last only seconds to minutes, or they may hold longer-term views which last several days to months. To reduce the complexity of this task, client trades can be clustered. By examining such clusters, we would likely observe many traders following common patterns of investment, but how do these patterns vary through time? Knowledge of the temporal distributions of such clusters may help financial institutions manage the overall portfolio of risk that accumulates from underlying trader positions. This study contributes to the field by demonstrating that the distribution of clusters derived from the real-world trades of 20k Foreign Exchange (FX) traders (from 2015 to 2017) is consistent with Ewens' Sampling Distribution. Further, we show that the Aggregating Algorithm (AA), an on-line algorithm for prediction with expert advice, can be applied to the aforementioned real-world data in order to improve the returns of portfolios of trader risk. However, we found that the AA "struggles" when presented with too many trader "experts", especially when there are many trades with similar overall patterns. To help overcome this challenge, we have applied and compared the use of Statistically Validated Networks (SVN) with a hierarchical clustering approach on a subset of the data, demonstrating that both approaches can be used to significantly improve results of the AA in terms of profitability and smoothness of returns.
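
To make the reference to Ewens' Sampling Distribution concrete, the following is a minimal Python sketch of the standard Ewens sampling formula. It is not the authors' fitting procedure. It assumes a partition of $n$ traders is summarised by a count vector in which entry $j$ is the number of clusters of size $j$, and that $\theta > 0$ is the single parameter of the distribution. The function names and the count-vector layout are illustrative choices, not taken from the paper.

```python
from fractions import Fraction
from math import factorial


def rising_factorial(theta, n):
    """Return theta * (theta + 1) * ... * (theta + n - 1)."""
    result = Fraction(1)
    for k in range(n):
        result *= theta + k
    return result


def ewens_probability(counts, theta):
    """Probability of a partition under the Ewens sampling formula.

    counts[j - 1] is the number of clusters of size j (an assumed layout),
    so sum(j * counts[j - 1]) is the total number of traders n.
    """
    theta = Fraction(theta)
    n = sum(j * a for j, a in enumerate(counts, start=1))
    # Normalising factor n! / (theta rising to n).
    prob = Fraction(factorial(n)) / rising_factorial(theta, n)
    # One factor theta^a / (j^a * a!) per cluster size j.
    for j, a in enumerate(counts, start=1):
        prob *= theta ** a / (Fraction(j) ** a * factorial(a))
    return prob


def partitions_as_counts(n):
    """Yield every integer partition of n as a cluster-size count vector."""
    def helper(remaining, max_part):
        if remaining == 0:
            yield []
            return
        for part in range(min(remaining, max_part), 0, -1):
            for rest in helper(remaining - part, part):
                yield [part] + rest

    for parts in helper(n, n):
        counts = [0] * n
        for p in parts:
            counts[p - 1] += 1
        yield counts
```

For three traders and $\theta = 1$, `ewens_probability` gives 1/6 for three singleton clusters, 1/2 for one singleton plus one pair, and 1/3 for a single cluster of three. Summing `ewens_probability` over `partitions_as_counts(n)` returns exactly 1 for any positive $\theta$, which is a quick sanity check on the formula.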

Paper Structure (21 sections, 21 equations, 11 figures, 2 tables)

This paper contains 21 sections, 21 equations, 11 figures, 2 tables.

Figures (11)

  • Figure 1: Evolution over time of several statistics (number of clusters, number of links, ratio of traders in clusters to active traders, number of clusters vs number of traders, mean cluster size, modularity) for a network of traders at deltas of 10, 15, 30, 60, 120, 180, 360 and 1440 minutes, for the EUR/USD currency pair.
  • Figure 2: Temporal evolution of the proportion vector and normalised proportion vector for clustering on EUR/USD with a 10-minute delta and a cutoff of 100. The Infomap algorithm was used to identify clusters after the SVN network was constructed.
  • Figure 3: Evolution of the $\theta$ parameter for all $\delta$ time slices and a cutoff of 1000, with the number of traders fixed at the 200 most active. Other scenarios produce similarly shaped curves: for deltas of 360 and 1440 the parameter $\theta$ stays more or less stationary, while for the others it increases suddenly at some point.
  • Figure 4: A comparison of the empirical data and the theoretical fit on the last sliding window. The plots were obtained using EUR/USD data for the 10-minute scale and a cutoff of 100.
  • Figure 5: Pass rate in percent for all $\delta$ time slices and cutoffs of 100, 500 and 1000. This rate is the proportion of sliding windows for which the null hypothesis of the $\chi^2$ test is not rejected.
  • ...and 6 more figures