From sectorial coarse graining to extreme coarse graining of S&P 500 correlation matrices

Manan Vyas, M. Mijaíl Martínez-Ramos, Parisa Majari, Thomas H. Seligman

TL;DR

The paper addresses the high dimensionality of Pearson correlation matrices for stock returns and proposes extreme coarse graining (ECG) to a real symmetric $2\times 2$ matrix by two-block averaging, preserving the average correlation as a key parameter. It compares this to sectorial coarse graining (CG), which yields a $10\times10$ matrix. Using 322 S&P 500 stocks over 2006–2023 ($T=4430$) with epochs of $L=20$ days and $k$-means clustering, the authors test three block choices to form the ECG and analyze market-state dynamics. ECG produces a three-parameter representation $(x,y,z)$ that captures essential features of market transitions with a qualitative structure comparable to CG, though certain relationships differ in sign (e.g., the correlation between the average correlation and $\lambda_{min}$ is positive for CG but negative for ECG Choice 1). The study demonstrates that significant dimensionality reduction is possible without losing the core dynamical picture, offering a compact framework for visualizing market states and suggesting avenues for extension to other markets and noise-robust techniques such as Power-Map or wavelets.

Abstract

Starting from the Pearson correlation matrix of stock returns and from the desire to obtain a reduced number of parameters relevant for the dynamics of a financial market, we propose to take the idea of a sectorial matrix, which would have a large number of parameters, to the extreme case of a real symmetric $2 \times 2$ matrix that still conserves the desirable feature that the average correlation can be one of the parameters. This is achieved by choosing two subsets of stocks for rows and columns and averaging the correlation matrix over each of the resulting blocks. This block averaging retains the average of the correlation matrix. We shall use a random selection for two equal block sizes as well as two specific, hopefully relevant, ones that do not produce equal block sizes. The results show that one of the non-random choices has somewhat different properties, whose meaning will have to be analyzed from an economic point of view.
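
The block averaging described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names `extreme_coarse_grain` and `weighted_average`, the synthetic returns, and the assignment of $(x,y,z)$ to the first diagonal block, the off-diagonal block, and the second diagonal block are assumptions made for illustration. The sketch also assumes that diagonal elements are included in the block means. Under that assumption, the average of the full matrix is recovered from the $2\times 2$ matrix by weighting each block mean with its number of elements.

```python
import numpy as np

def extreme_coarse_grain(C, block1):
    """Average a correlation matrix over the 2x2 blocks induced by one index subset.

    block1 holds the indices of the first subset of stocks; all remaining
    indices form the second subset (hypothetical helper, not from the paper).
    """
    C = np.asarray(C, dtype=float)
    mask = np.zeros(C.shape[0], dtype=bool)
    mask[np.asarray(block1)] = True
    a, b = np.flatnonzero(mask), np.flatnonzero(~mask)
    x = C[np.ix_(a, a)].mean()  # first diagonal block (diagonal included: assumption)
    y = C[np.ix_(a, b)].mean()  # off-diagonal block
    z = C[np.ix_(b, b)].mean()  # second diagonal block
    return np.array([[x, y], [y, z]]), (len(a), len(b))

def weighted_average(E, sizes):
    """Recover the full-matrix average from the block means and block sizes."""
    n1, n2 = sizes
    weights = np.array([[n1 * n1, n1 * n2], [n1 * n2, n2 * n2]], dtype=float)
    return (weights * E).sum() / weights.sum()

# Synthetic data for illustration only: 20 days of returns for 8 stocks.
rng = np.random.default_rng(0)
returns = rng.standard_normal((20, 8))
C = np.corrcoef(returns, rowvar=False)

E, sizes = extreme_coarse_grain(C, [0, 1, 2])  # unequal blocks of 3 and 5 stocks
print(E)
print(weighted_average(E, sizes), C.mean())
```

For two equal block sizes the weights coincide up to the factor of two on the off-diagonal block, so the plain mean of the four entries of the $2\times 2$ matrix already equals the average correlation.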

Paper Structure

This paper contains 5 sections, 2 equations, 11 figures, 1 table.

Figures (11)

  • Figure 1: Pearson correlation matrix $C$ defined by Eq. (2) of the S&P 500 data in a time horizon from January 3rd 2006 to August 10th 2023. Pearson correlation matrix elements are computed using logarithmic return time series of adjusted closing prices.
  • Figure 2: Choices 1, 2 and 3 for constructing the $2 \times 2$ correlation matrices. In Choice 1, the sectors with strong intra-sectorial correlations (EG, FN, TC, UT) form the first block and the rest of the sectors form the second block; in Choice 2, the sectors with strong inter-sectorial correlations (CD, FN, ID) form the first block and the rest of the sectors form the second block; and in Choice 3, an equal number of stocks is chosen randomly for each block. The chosen blocks in each case are marked in red. The blocks are scaled according to the number of stocks in each block.
  • Figure 3: Time evolution of market states of the S&P 500 data using (a) CG and (b)-(d) ECG Pearson correlation matrix $C$ defined by Eq. (2) in a time horizon from January 3rd 2006 to August 10th 2023 with an epoch of 20 trading days. Pearson correlation matrix elements are computed using logarithmic return time series of adjusted closing prices. The market states are arranged in order of increasing average correlation. The average correlations for the states are (a) 0.1586, 0.2654, 0.3734, 0.4842, 0.6462; (b) 0.1496, 0.2613, 0.3706, 0.4838, 0.6465; (c) 0.165, 0.269, 0.372, 0.484, 0.643; and (d) 0.1493, 0.2594, 0.3704, 0.4809, 0.6429, respectively. The Pearson correlation coefficients between all pairs of panels (a)-(d) are above $0.92$.
  • Figure 4: (a) Time evolution of market states of the S&P 500 data using the CG Pearson correlation matrix $C$ defined by Eq. (2) in a time horizon from January 3rd 2006 to August 10th 2023 with an epoch of 20 trading days. Each state is represented by a different color and dashed horizontal lines indicate the dates of stock market crashes; see Table 1 for details. Time evolution of (b) average correlation, (c) largest eigenvalue, and (d) smallest eigenvalue. The Pearson correlation coefficient between the average correlation and $\lambda_{max}$ is 0.998, between the average correlation and $\lambda_{min}$ is 0.364, and between $\lambda_{max}$ and $\lambda_{min}$ is 0.350.
  • Figure 5: (a) Time evolution of market states of the S&P 500 data using the ECG Pearson correlation matrix $C$ defined by Eq. (2) in a time horizon from January 3rd 2006 to August 10th 2023 with an epoch of 20 trading days for Choice 1, same as in Fig. 2. Each state is represented by a different color and dashed horizontal lines indicate the dates of stock market crashes; see Table 1 for details. Time evolution of (b) average correlation, (c) largest eigenvalue, and (d) smallest eigenvalue. The Pearson correlation coefficient between the average correlation and $\lambda_{max}$ is 0.999, between the average correlation and $\lambda_{min}$ is -0.311, and between $\lambda_{max}$ and $\lambda_{min}$ is -0.328.
  • ...and 6 more figures
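
The per-epoch quantities tracked in the figure captions (average correlation, largest eigenvalue, smallest eigenvalue) can be sketched as follows. This is a hedged illustration and not the authors' pipeline: the function name `ecg_epoch_statistics`, the synthetic price series, and the use of non-overlapping epochs are assumptions, as is including the diagonal in the block means. The $k$-means step that assigns epochs to market states is left out.

```python
import numpy as np

def ecg_epoch_statistics(prices, block1, epoch_len=20):
    """Per-epoch average correlation and eigenvalues of the 2x2 ECG matrix.

    prices: array of shape (days, stocks) of adjusted closing prices.
    block1: indices of the stocks in the first block (hypothetical input).
    Epochs are taken as non-overlapping windows (assumption).
    """
    prices = np.asarray(prices, dtype=float)
    log_returns = np.diff(np.log(prices), axis=0)  # logarithmic returns
    mask = np.zeros(prices.shape[1], dtype=bool)
    mask[np.asarray(block1)] = True
    a, b = np.flatnonzero(mask), np.flatnonzero(~mask)

    rows = []
    for start in range(0, log_returns.shape[0] - epoch_len + 1, epoch_len):
        window = log_returns[start:start + epoch_len]
        C = np.corrcoef(window, rowvar=False)  # Pearson correlation matrix
        x = C[np.ix_(a, a)].mean()
        y = C[np.ix_(a, b)].mean()
        z = C[np.ix_(b, b)].mean()
        # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order
        lam_min, lam_max = np.linalg.eigvalsh(np.array([[x, y], [y, z]]))
        rows.append((C.mean(), lam_max, lam_min))
    return np.array(rows)

# Synthetic prices for illustration only: 200 days, 6 stocks, one common factor.
rng = np.random.default_rng(1)
common = rng.standard_normal((200, 1))
noise = rng.standard_normal((200, 6))
prices = 100.0 * np.exp(np.cumsum(0.01 * (0.6 * common + noise), axis=0))

stats = ecg_epoch_statistics(prices, block1=[0, 1, 2])
print(stats.shape)  # one row per epoch: (average correlation, lambda_max, lambda_min)
print(np.corrcoef(stats, rowvar=False))  # correlations between the three series
```

The last line computes Pearson correlation coefficients between the three time series, the same kind of quantity quoted in the captions of Figures 4 and 5; the values obtained from synthetic data carry no meaning for the S&P 500 results.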