Eigenvalue Distribution of Empirical Correlation Matrices for Multiscale Complex Systems and Application to Financial Data

Luan M. T. de Moraes, Antônio M. S. Macêdo, Giovani L. Vasconcelos, Raydonal Ospina

TL;DR

These findings not only support the turbulent market hypothesis as a source of noise but also provide a practical framework for noise reduction in empirical correlation matrices, enhancing the inference of true market correlations between assets.

Abstract

We introduce a method for describing eigenvalue distributions of correlation matrices from multidimensional time series. Using our newly developed matrix H theory, we improve the description of eigenvalue spectra for empirical correlation matrices in multivariate financial data by considering an informational cascade modeled as a hierarchical structure akin to the Kolmogorov statistical theory of turbulence. Our approach extends the Marchenko-Pastur distribution to account for distinct characteristic scales, capturing a larger fraction of data variance, and challenging the traditional view of noise-dressed financial markets. We conjecture that the effectiveness of our method stems from the increased complexity in financial markets, reflected by new characteristic scales and the growth of computational trading. These findings not only support the turbulent market hypothesis as a source of noise but also provide a practical framework for noise reduction in empirical correlation matrices, enhancing the inference of true market correlations between assets.

Paper Structure

This paper contains 20 sections, 59 equations, 8 figures, 1 table.

Figures (8)

  • Figure 1: Theoretical eigenvalue distributions $\rho_N(\lambda)$ for the Wishart class with fixed $q = \varepsilon_0 = 0.5$ and several values of $N$ and $\beta$. In (a) we have $\beta = 1.0$ and $N = 1, 2, 3$; whereas in (b) $N = 1$ and $\beta = 0.5, 1.0, 1.5$. The corresponding Marchenko-Pastur distribution (MP) is also shown. The insets show in semi-log scale the asymptotic behavior of the distributions, which is predicted by Eq. (\ref{eq:tailrhoW}).
  • Figure 2: Theoretical eigenvalue distributions $\rho_N(\lambda)$ for the inverse Wishart class, with fixed $q = \varepsilon_0 = 0.5$ and several values of $N$ and $\beta$. In (a) we have $\beta = 1.0$ and $N = 1, 2, 3$; whereas in (b) $N = 1$ and $\beta = 0.5, 1.0, 1.5$. The corresponding Marchenko-Pastur distribution (MP) is also shown. The insets show in log-log scale the power-law tails of the distributions, as predicted by Eq. (\ref{eq:tailrhoIW}).
  • Figure 3: Empirical correlation matrix of stock returns in the S&P 500 from 5/1/2020 to 5/1/2025. The clustering in the heatmap is obtained using the hierarchical clustering method based on the correlation distance \cite{mantegna1999introduction}. This matrix captures the observed correlations, which include both true market structure and finite-size noise.
  • Figure 4: Main figure: MP distribution with $q = 424/1259$ held fixed and $\varepsilon_0 = 0.29$ obtained by a least-squares fit. Inset: the empirical distribution of eigenvalues on a larger scale, showing that the eigenvalues range from units to hundreds.
  • Figure 5: The aggregated signal histogram and the recovered probability density function with $L=18$ using the methods reported in Section \ref{HMT}. The recovered PDF describes nicely the aggregated histogram.
  • ...and 3 more figures
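
As a point of reference for the Marchenko-Pastur (MP) baseline mentioned in the captions above, the following is a minimal sketch of the standard MP density, not the authors' matrix H theory code. It assumes that $q$ is the ratio of the number of assets to the number of observations and that $\varepsilon_0$ plays the role of the variance scale in the textbook MP formula; the function names `mp_edges` and `mp_density` are hypothetical and chosen only for illustration.

```python
import numpy as np


def mp_edges(q, sigma2=1.0):
    """Lower and upper edges of the Marchenko-Pastur support (0 < q <= 1).

    Assumption: sigma2 stands in for the variance scale (epsilon_0 in the
    figure captions); this mapping is not confirmed by the summary text.
    """
    lam_minus = sigma2 * (1.0 - np.sqrt(q)) ** 2
    lam_plus = sigma2 * (1.0 + np.sqrt(q)) ** 2
    return lam_minus, lam_plus


def mp_density(lam, q, sigma2=1.0):
    """Standard Marchenko-Pastur density evaluated at the points lam.

    rho(lam) = sqrt((lam_plus - lam) * (lam - lam_minus))
               / (2 * pi * q * sigma2 * lam)
    inside the support, and zero outside it.
    """
    lam = np.atleast_1d(np.asarray(lam, dtype=float))
    lam_minus, lam_plus = mp_edges(q, sigma2)
    rho = np.zeros_like(lam)
    inside = (lam > lam_minus) & (lam < lam_plus)
    x = lam[inside]
    rho[inside] = np.sqrt((lam_plus - x) * (x - lam_minus)) / (
        2.0 * np.pi * q * sigma2 * x
    )
    return rho


if __name__ == "__main__":
    # Parameter values quoted in the Figure 4 caption.
    q, eps0 = 424 / 1259, 0.29
    lo, hi = mp_edges(q, eps0)
    grid = np.linspace(lo, hi, 2001)
    rho = mp_density(grid, q, eps0)
    print("support edges:", lo, hi)
    print("peak density on grid:", rho.max())
```

Empirical eigenvalues that fall well above the upper edge returned by `mp_edges` are the ones a pure-noise MP model cannot account for, which is the regime the multiscale extension described in the abstract is meant to address.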