Adjacency Spectral Embeddings of Correlation Networks

Keith Levin

TL;DR

It is shown that when the time series are expressible in terms of a small number of Fourier basis elements (or in some other suitably-chosen basis), correlation networks correspond to latent space networks with dependent edge noise in which the vertex-level latent variables encode the basis coefficients.

Abstract

In many applications, weighted networks are constructed based on time series data: each time series is associated to a vertex and edge weights are given by pairwise correlations. The result is a network whose edge dependency structure violates the assumptions of most common network models. Nonetheless, it is common to analyze these "correlation networks" using embedding methods derived from edge-independent network models, based on a belief that the edges are approximately independent. In this work, we put this modeling choice on firm theoretical ground. We show that when the time series are expressible in terms of a small number of Fourier basis elements (or in some other suitably-chosen basis), correlation networks correspond to latent space networks with dependent edge noise in which the vertex-level latent variables encode the basis coefficients. Further, we show that when time series are observed subject to noise, spectral embedding of the resulting noisy correlation network still recovers these true vertex-level latent representations under suitable assumptions. This characterization of embeddings as learning Fourier coefficients appears to be folklore in the signal processing community in the context of principal component analysis, but is, to the best of our knowledge, new to the statistical network analysis literature.
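The pipeline the abstract describes can be sketched numerically. The following is a minimal illustration, not the paper's construction: the helper names (`fourier_basis`, `adjacency_spectral_embedding`), the choice of frequencies, the Gaussian coefficients and the noise level are all assumptions made for the example. It builds time series from a few real Fourier basis elements, forms the correlation network, and embeds it with its top eigenpairs.

```python
import numpy as np

def fourier_basis(T, freqs):
    """Hypothetical helper: real Fourier basis (cos/sin pairs) on T time points."""
    t = np.arange(T)
    cols = []
    for k in freqs:
        cols.append(np.cos(2 * np.pi * k * t / T))
        cols.append(np.sin(2 * np.pi * k * t / T))
    return np.column_stack(cols)  # shape (T, 2 * len(freqs))

def adjacency_spectral_embedding(A, d):
    """Top-d eigenvectors of symmetric A, scaled by sqrt of |eigenvalue|."""
    vals, vecs = np.linalg.eigh(A)
    idx = np.argsort(np.abs(vals))[::-1][:d]  # d largest-magnitude eigenvalues
    return vecs[:, idx] * np.sqrt(np.abs(vals[idx]))

rng = np.random.default_rng(0)
n, T, freqs = 50, 200, [1, 3, 7]       # assumed sizes and frequencies
B = fourier_basis(T, freqs)            # T x d basis matrix
d = B.shape[1]                         # number of basis elements (here 6)

Z = rng.normal(size=(n, d))            # assumed vertex-level basis coefficients
signal = Z @ B.T                       # n noiseless time series of length T
noisy = signal + 0.1 * rng.normal(size=(n, T))  # assumed additive noise

C_clean = np.corrcoef(signal)          # noiseless correlation network
C_noisy = np.corrcoef(noisy)           # observed correlation network

X_clean = adjacency_spectral_embedding(C_clean, d)
X_noisy = adjacency_spectral_embedding(C_noisy, d)

# The noiseless correlation matrix has rank at most d, so its d-dimensional
# embedding reproduces it; the noisy embedding approximates the same matrix.
print(np.linalg.matrix_rank(C_clean))
print(np.abs(X_noisy @ X_noisy.T - C_clean).max())
```

Because each embedding is determined only up to an orthogonal transformation, the comparison is made through the Gram matrices `X @ X.T` rather than through the embeddings themselves.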

Paper Structure (18 sections, 21 theorems, 221 equations, 6 figures)

Key Result

Lemma 1

If the matrix ${\mathbf{Z^\star}}$ has all real entries, then so does the matrix $\mathbf{F^\star} \mathbf{K}^{1/2}$, and these entries are given by

Figures (6)

  • Figure 1: Estimation error in $({2,\infty})$-norm as a function of variance parameter $\nu$ by the ASE (blue circles), PCA (purple squares) and the naïve baseline (green triangles), when applied to $n=200$ time series of length $T=200$ for four choices of Fourier basis size $d_0 = 10,20,30,50$.
  • Figure 2: Estimation error in $({2,\infty})$-norm as a function of variance factor $\alpha$ by the ASE (blue circles), PCA (purple squares) and the naïve baseline (green triangles), as applied to $n=200$ time series of length $T=500$ with Fourier basis size $d_0 = 20$. Shaded regions indicate two standard errors of the mean.
  • Figure 3: Estimation error in $({2,\infty})$-norm as a function of the number of frequencies $d_0$ for ASE (blue circles), PCA (purple squares) and naïve baseline (green triangles), when applied to $n=1200$ time series of length $T=1800$ under three different choices of noise variance $\nu$.
  • Figure 4: Estimation error in $({2,\infty})$-norm as a function of the number of time series $n$ by the ASE (blue circles), PCA (purple squares) and the naïve baseline (green triangles), as applied to time series of length $T=1000$ under three different choices of $d_0$.
  • Figure 5: $({2,\infty})$-norm estimation error as a function of entrywise noise variance under Gaussian (top) and Laplacian (bottom) noise for time series of length $T=100$ (left), $T=500$ (middle), and $T=1000$ (right). Performance of three estimators is shown: the ASE (blue circles), PCA (purple squares) and a naïve baseline (green triangles). Each data point indicates the mean of $50$ Monte Carlo estimates. Error bars (obscured by the lines) indicate two standard errors of the mean.
  • ...and 1 more figures

Theorems & Definitions (43)

  • Remark 1
  • Lemma 1
  • Lemma 2
  • Remark 2
  • Theorem 1
  • proof: Proof of Lemma 1
  • Lemma 3
  • proof
  • Lemma 4
  • proof
  • ...and 33 more