Optimizing the Induced Correlation in Omnibus Joint Graph Embeddings

Konstantinos Pantazis, Michael Trosset, William N. Frost, Carey E. Priebe, Vince Lyzinski

TL;DR

This work presents the first efforts to automate the Omnibus construction in order to address two key questions in this joint embedding framework: the correlation-to-OMNI problem and the flat correlation problem.

Abstract

Theoretical and empirical evidence suggests that joint graph embedding algorithms induce correlation across the networks in the embedding space. In the Omnibus joint graph embedding framework, previous results explicitly delineated the dual effects of the algorithm-induced and model-inherent correlations on the correlation across the embedded networks. Accounting for and mitigating the algorithm-induced correlation is key to subsequent inference, as sub-optimal Omnibus matrix constructions have been demonstrated to lead to loss in inference fidelity. This work presents the first efforts to automate the Omnibus construction in order to address two key questions in this joint embedding framework: the correlation-to-OMNI problem and the flat correlation problem. In the flat correlation problem, we seek to understand the minimum algorithm-induced flat correlation (i.e., the same across all graph pairs) produced by a generalized Omnibus embedding. Working in a subspace of the fully general Omnibus matrices, we prove both a lower bound for this flat correlation and that the classical Omnibus construction induces the maximal flat correlation. In the correlation-to-OMNI problem, we present an algorithm -- named corr2Omni -- that, from a given matrix of estimated pairwise graph correlations, estimates the matrix of generalized Omnibus weights that induces optimal correlation in the embedding space. Moreover, in both simulated and real data settings, we demonstrate the increased effectiveness of our corr2Omni algorithm versus the classical Omnibus construction.
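
For orientation, the classical Omnibus construction that the abstract uses as its baseline can be sketched as below. This is a minimal illustration, not the paper's code: it assumes the standard construction in which block $(i,j)$ of the Omnibus matrix is the average of $\mathbf{A}^{(i)}$ and $\mathbf{A}^{(j)}$, followed by an adjacency spectral embedding (ASE). The function names, the toy latent positions, and the sampling helper are hypothetical, and the generalized weights estimated by corr2Omni are not reproduced here.

```python
import numpy as np


def classical_omnibus(adjs):
    """Build the classical omnibus matrix: block (i, j) is (A_i + A_j) / 2."""
    m = len(adjs)
    n = adjs[0].shape[0]
    M = np.zeros((m * n, m * n))
    for i in range(m):
        for j in range(m):
            M[i * n:(i + 1) * n, j * n:(j + 1) * n] = (adjs[i] + adjs[j]) / 2.0
    return M


def ase(M, d):
    """Spectral embedding: top-d eigenvectors scaled by sqrt(|eigenvalue|)."""
    vals, vecs = np.linalg.eigh(M)          # M is symmetric
    top = np.argsort(vals)[::-1][:d]        # indices of the d largest eigenvalues
    return vecs[:, top] * np.sqrt(np.abs(vals[top]))


def sample_graph(P, rng):
    """Sample a symmetric, hollow adjacency matrix with edge probabilities P."""
    upper = np.triu(rng.random(P.shape) < P, k=1).astype(float)
    return upper + upper.T


# Toy example (assumed parameters): m independent graphs on shared latent positions.
rng = np.random.default_rng(0)
n, m, d = 50, 3, 2
X = rng.uniform(0.2, 0.7, size=(n, d))      # inner products stay inside [0, 1]
P = X @ X.T
adjs = [sample_graph(P, rng) for _ in range(m)]

M = classical_omnibus(adjs)
X_hat = ase(M, d)                           # rows i*n ... (i+1)*n - 1 embed graph i
per_graph = X_hat.reshape(m, n, d)
```

Because every off-diagonal block averages two graphs, the embedded copies of a vertex are pulled toward one another even when the graphs are sampled independently; this is the algorithm-induced correlation that the abstract refers to.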

Paper Structure (21 sections, 5 theorems, 82 equations, 8 figures)

This paper contains 21 sections, 5 theorems, 82 equations, 8 figures.

Key Result

Theorem 1

Let $F$ be a distribution on a set $\mathcal{X}\subset \mathbb{R}^{d}$, where $\langle x, x'\rangle\in[0,1]$ for all $x,x'\in\mathcal{X}$, and assume that $\Delta := \mathbb{E}[X_1 X_1^{T}]$ has rank $d$. Let $(\mathbf{A}_n^{(1)},\mathbf{A}_n^{(2)},\cdots,\mathbf{A}_n^{(m)},\mathbf{X}_n)\sim\mathrm{J}$… where $\widetilde{\Sigma}_{\rho}(x;s_1,s_2)$ is given by … where $\Sigma(x) := \Delta^{-1}$ …

Figures (8)

  • Figure 1: We sample $nMC=200$ graph triplets from the $\mathrm{JRDPG}_{\text{gen}}(F,500,3,\rho\mathds{1}_3)$ distribution where $F$ is specified in Section \ref{sec:simFlat} and $\rho=0\,(L),\,0.25\,(C),\,0.5\,(R)$. In each panel, we plot $\sqrt{500}(\hat{\mathbf{X}}_1-\hat{\mathbf{X}}_{501})$ where $\hat{\mathbf{X}}$ is the ASE embedding of classical OMNI (in black) and $\mathfrak{M}_{3-}^{W}$ (in red).
  • Figure 2: Vertex-to-vertex Euclidean distance matrices computed from the classical OMNI embedding (L) and the corr2Omni embedding (R) in the social network setting of Section \ref{sec:soc}.
  • Figure 3: For the social networks considered in Section \ref{sec:soc}, we plot row-wise Euclidean distance matrices computed for $\mathbf{A}^{(1)}$ (L), $\mathbf{A}^{(2)}$ (C), and $\mathbf{A}^{(3)}$ (R) (so that $\mathbf{D}^{(i)}_{k,\ell}=\|\mathbf{A}^{(i)}_{k,\bullet}-\mathbf{A}^{(i)}_{\ell,\bullet}\|_F$).
  • Figure 4: Detail of the $\mathcal{A}$ matrix obtained by corr2Omni (R) and classical OMNI (L) in the DTMRI experiment. The diagonal has been zeroed out so that the off-diagonal detail can be seen. Larger values (at most 1) are shown in lighter grey, with black denoting a value of 0 and white a value of 1.
  • Figure 5: For the 30 DTMRI networks, we apply the corr2Omni procedure (in the WOMNI setting) to find the weight tensor $\mathbf{C}$ that aims to induce correlation $\mathfrak{R}=0.54 \cdot \mathbf{J}_{30}+0.46 \cdot \mathbf{I}_{30}.$ We embed the collection using both classical OMNI and the corr2Omni weights. Once the collection networks are embedded, we compute the Frobenius norm distance between each graph pair in the embedding and use hierarchical clustering to cluster the collection of 30 graphs into 10 clusters. The clustering dendrograms for classical OMNI (L) and corr2Omni (R) are shown. Lastly, these cluster labels are compared (using Adjusted Rand Index or ARI) to the true patient labels.
  • ...and 3 more figures
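
The evaluation pipeline described in the Figure 5 caption (pairwise Frobenius distances between embedded graphs, hierarchical clustering, comparison with true labels via ARI) might be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the caption does not name a linkage method, so average linkage is assumed here; the function names are hypothetical; and the toy embeddings are synthetic stand-ins rather than the DTMRI data.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def pairwise_frobenius(embeddings):
    """Frobenius-norm distance between every pair of embedded graphs."""
    m = len(embeddings)
    D = np.zeros((m, m))
    for i in range(m):
        for j in range(i + 1, m):
            D[i, j] = D[j, i] = np.linalg.norm(embeddings[i] - embeddings[j], ord="fro")
    return D


def cluster_graphs(D, k, method="average"):
    """Hierarchical clustering into at most k clusters (linkage method is an assumption)."""
    Z = linkage(squareform(D, checks=False), method=method)
    return fcluster(Z, t=k, criterion="maxclust")


def adjusted_rand_index(a, b):
    """Adjusted Rand Index computed from the contingency table of two labelings."""
    _, ai = np.unique(np.asarray(a).ravel(), return_inverse=True)
    _, bi = np.unique(np.asarray(b).ravel(), return_inverse=True)
    table = np.zeros((ai.max() + 1, bi.max() + 1))
    np.add.at(table, (ai, bi), 1)

    def comb2(x):
        return x * (x - 1) / 2.0

    index = comb2(table).sum()
    sum_a = comb2(table.sum(axis=1)).sum()
    sum_b = comb2(table.sum(axis=0)).sum()
    expected = sum_a * sum_b / comb2(len(ai))
    max_index = 0.5 * (sum_a + sum_b)
    if max_index == expected:               # degenerate case, e.g. one cluster each
        return 1.0
    return (index - expected) / (max_index - expected)


# Synthetic stand-in: 3 "patients" with 2 scans each, well separated by construction.
rng = np.random.default_rng(1)
truth = np.repeat(np.arange(3), 2)
embeddings = [10.0 * g + 0.01 * rng.standard_normal((20, 2)) for g in truth]

D = pairwise_frobenius(embeddings)
labels = cluster_graphs(D, k=3)
ari = adjusted_rand_index(labels, truth)
```

In the setting of the caption one would pass the 30 per-graph embedding blocks and `k=10`; note that `fcluster` with `criterion="maxclust"` returns at most `k` clusters, not necessarily exactly `k`.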

Theorems & Definitions (10)

  • Definition 1.1: Generalized Omnibus Matrix
  • Definition 1.2
  • Definition 1.3: Joint Random Dot Product Graph of pantazis2022importance
  • Theorem 1
  • Lemma 1
  • Theorem 2
  • Theorem 3
  • Definition 4.1
  • Lemma 2
  • proof