Pearson Correlations on Networks: Corrigendum
Michele Coscia, Karel Devriendt
TL;DR
This note identifies a fundamental constraint for extending Pearson correlations to networks: the weighting matrix $W$ used to couple node values must be positive definite on $\operatorname{span}(1)^{\perp}$, otherwise the correlation can be imaginary or unbounded. It shows that using the standard shortest-path distance $P$ to define $W=e^{-kP}$ fails this condition in general. The authors formalize the requirement and propose two natural classes of negative type metrics, effective resistance and Euclidean node embeddings, for which $W=e^{-kD}$ guarantees well-defined network correlations. They provide practical guidance, tests, and public code to enable robust calculation of network correlations in graphs. The results apply to any method that relies on distance-aware correlations over networks and clarify when standard choices are valid.
Abstract
Recently, the first author proposed a measure to calculate Pearson correlations for node values defined on a network, by taking into account distances or metrics defined on the network. In this technical note, we show that an arbitrary choice of distance might result in imaginary or unbounded correlation values, which is undesirable. We prove that this problem is solved by restricting to a special class of distances: negative type metrics. We also discuss two natural classes of negative type metrics on graphs, for which the network correlations are properly defined.
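The condition described above can be checked numerically. The following is a minimal sketch, not the authors' released code. It assumes that $e^{-kD}$ is taken entrywise, that node values are centered by their plain mean, and that the correlation takes the form $\hat{x}^{\top} W \hat{y} / \sqrt{(\hat{x}^{\top} W \hat{x})(\hat{y}^{\top} W \hat{y})}$. All function names are hypothetical.

```python
import numpy as np

def effective_resistance(adj):
    """Effective resistance matrix of a connected, undirected graph."""
    lap = np.diag(adj.sum(axis=1)) - adj          # graph Laplacian
    lap_pinv = np.linalg.pinv(lap)                # Moore-Penrose pseudoinverse
    d = np.diag(lap_pinv)
    return d[:, None] + d[None, :] - 2.0 * lap_pinv

def min_eig_on_centered_subspace(w):
    """Smallest eigenvalue of W restricted to span(1)^perp."""
    n = w.shape[0]
    # Rows 1..n-1 of vt form an orthonormal basis of the complement of span(1).
    _, _, vt = np.linalg.svd(np.ones((1, n)))
    q = vt[1:].T
    return np.linalg.eigvalsh(q.T @ w @ q).min()

def network_correlation(x, y, w):
    """Assumed form of the network correlation (see lead-in)."""
    xc = x - x.mean()
    yc = y - y.mean()
    return (xc @ w @ yc) / np.sqrt((xc @ w @ xc) * (yc @ w @ yc))

# Example graph: a cycle on 5 nodes.
n = 5
adj = np.zeros((n, n))
for i in range(n):
    adj[i, (i + 1) % n] = adj[(i + 1) % n, i] = 1.0

k = 1.0                                           # decay parameter (illustrative)
w = np.exp(-k * effective_resistance(adj))        # entrywise exponential (assumption)
min_eig = min_eig_on_centered_subspace(w)

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 1.0, 4.0, 3.0, 6.0])
rho = network_correlation(x, y, w)
print(min_eig > 0, rho)
```

If the smallest eigenvalue on $\operatorname{span}(1)^{\perp}$ is positive, both terms under the square root are positive for non-constant node values, and the Cauchy-Schwarz inequality bounds the correlation between $-1$ and $1$. The same check can be applied to any other candidate distance matrix in place of the effective resistance.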
