NExON-Bayes: A Bayesian approach to network estimation informed by ordinal covariates

Joseph Feest, Hélène Ruffieux, Camilla Lingjærde, Xiaoyue Xi

TL;DR

NExON-Bayes is a Bayesian joint Gaussian graphical modelling framework that accounts for sample-level heterogeneity through an ordinal covariate by estimating a set of covariate-specific precision matrices $\{\boldsymbol{\Omega}^{(a)}\}_{a\in\mathcal{A}}$, one for each covariate level $a$. Edge selection uses a spike-and-slab prior, with a probit submodel linking each edge inclusion indicator to the covariate level through $\delta^{(a)}_{ij} \mid \zeta_{ij},\beta_{ij} \sim \text{Bernoulli}\{\Phi(\zeta_{ij}+a\beta_{ij})\}$, so that networks can vary with the covariate while each precision matrix remains positive definite. A deterministic variational Bayes expectation conditional maximisation (VBECM) algorithm performs scalable inference by approximating the posterior with a variational distribution $q(\underline{\boldsymbol{\Omega}}, \boldsymbol{\Theta})$ and iteratively increasing the evidence lower bound (ELBO), with the spike variance $\nu_0$ selected through a line search on $\text{BIC}_\gamma$. In simulations, NExON-Bayes achieves better precision and recall than competing methods; applied to a TCGA BRCA proteomics dataset, it uncovers covariate-driven changes in pathways and hub proteins, providing interpretable insights into disease progression and potential therapeutic targets.
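
The probit link in the TL;DR can be illustrated with a few lines of Python. This is a minimal sketch of the formula $\Phi(\zeta_{ij}+a\beta_{ij})$ only, not the NExON R package: the function names and the numerical values of $\zeta_{ij}$ and $\beta_{ij}$ are hypothetical, chosen purely for illustration.

```python
from math import erf, sqrt


def std_normal_cdf(x: float) -> float:
    """Standard normal CDF, Phi(x), written in terms of the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def edge_inclusion_prob(zeta: float, beta: float, a: float) -> float:
    """Probability that edge (i, j) is included at covariate level a,
    i.e. Phi(zeta_ij + a * beta_ij) from the probit submodel."""
    return std_normal_cdf(zeta + a * beta)


# Hypothetical values (assumptions, not estimates from the paper):
# a negative baseline zeta_ij and a positive covariate effect beta_ij.
zeta_ij, beta_ij = -1.0, 0.5
levels = [1, 2, 3]  # e.g. disease stages I, II and III coded as 1, 2, 3

probs = [edge_inclusion_prob(zeta_ij, beta_ij, a) for a in levels]
for a, p in zip(levels, probs):
    print(f"level a={a}: P(delta_ij = 1) = {p:.3f}")
```

With $\beta_{ij} > 0$ the inclusion probability increases with the covariate level (an edge that "appears"), with $\beta_{ij} < 0$ it decreases, and with $\beta_{ij} = 0$ the edge is equally likely at every level.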

Abstract

In heterogeneous disease settings, accounting for intrinsic sample variability is crucial for obtaining reliable and interpretable omic network estimates. However, most graphical model analyses of biomedical data assume homogeneous conditional dependence structures, potentially leading to misleading conclusions. To address this, we propose a joint Gaussian graphical model that leverages sample-level ordinal covariates (e.g., disease stage) to account for heterogeneity and improve the estimation of partial correlation structures. Our modelling framework, called NExON-Bayes, extends the graphical spike-and-slab framework to account for ordinal covariates, jointly estimating their relevance to the graph structure and leveraging them to improve the accuracy of network estimation. To scale to high-dimensional omic settings, we develop an efficient variational inference algorithm tailored to our model. Through simulations, we demonstrate that our method outperforms the vanilla graphical spike-and-slab (with no covariate information), as well as other state-of-the-art network approaches which exploit covariate information. Applying our method to reverse phase protein array data from patients diagnosed with stage I, II or III breast carcinoma, we estimate the behaviour of proteomic networks as breast carcinoma progresses. Our model provides insights not only through inspection of the estimated proteomic networks, but also of the estimated ordinal covariate dependencies of key groups of proteins within those networks, offering a comprehensive understanding of how biological pathways shift across disease stages. Availability and Implementation: A user-friendly R package for NExON-Bayes with tutorials is available on GitHub at github.com/jf687/NExON.

Paper Structure

This paper contains 12 sections, 13 equations, 4 figures, 1 table.

Figures (4)

  • Figure 1: Schematic representation of the NExON-Bayes model. Shaded nodes are observed, and non-shaded are latent variables that are inferred.
  • Figure 2: Posterior mean estimates of each $\beta_{ij}$ (lower triangle) versus simulated coefficients of the ordinal covariate (upper triangle). Note that there is no 'true' $\boldsymbol{\beta}$, as the precision matrices are not simulated from the model. However, there is a known structure based on positive ($\beta_{ij} > 0$) and negative ($\beta_{ij} < 0$) dependencies between the covariate and the partial correlations. The outline of the true structure is overlaid on the estimated structure to allow a clear comparison.
  • Figure 3: Performance of covdepGE, mGHS, NExON-Bayes and SSJGL in the simulated scenario with $P = 100$ nodes and $N = 200$ samples for each of $|\mathcal{A}| = 4$ networks. Fifty edges 'appear' (the absolute value of the partial correlation increases linearly with the covariate) and fifty edges 'disappear' (the absolute value of the partial correlation decreases linearly with the covariate). Performance is assessed over $100$ replicates for mGHS, NExON-Bayes and SSJGL, and over $10$ replicates for covdepGE due to its excessive runtime for large $N$ ($20$+ hours).
  • Figure 4: Subnetworks showing the edges with the largest absolute posterior point estimates of $\beta_{ij}$ in the BRCA application. The left network shows the edges with the largest positive $\beta_{ij}$ values, and the right network shows the edges with the most negative $\beta_{ij}$ values.