Multi-omics network reconstruction with collaborative graphical lasso

Alessio Albanese, Wouter Kohlen, Pariya Behrouzi

TL;DR

Reconstructing integrated networks from multi-omics data requires methods that respect layer-specific contributions while sharing information across layers. The authors introduce coglasso, a collaborative graphical lasso that adds a cross-layer collaboration term and separate within- and between-layer penalties, with XStARS for multi-hyperparameter stability. Simulations demonstrate coglasso's superiority over glasso in both edge recovery and precision-matrix estimation, and a sleep-deprivation mouse study shows biologically coherent connections, validating the approach on real data. The work provides an open-source, general framework for multi-omics network inference with potential applications in neuroscience and beyond.

Abstract

Motivation: In recent years, the availability of multi-omics data has increased substantially. Multi-omics data integration methods mainly aim to leverage different molecular layers to gain a complete molecular description of biological processes. An attractive integration approach is the reconstruction of multi-omics networks. However, the development of effective multi-omics network reconstruction strategies lags behind.

Results: In this study, we introduce collaborative graphical lasso, a novel approach that extends graphical lasso by incorporating collaboration between omics layers, thereby improving multi-omics data integration and enhancing network inference. Our method leverages a collaborative penalty term, which harmonizes the contribution of the omics layers to the reconstruction of the network structure. This promotes a cohesive integration of information across modalities, and it is introduced alongside a dual regularization scheme that separately controls sparsity within and between layers. To address the challenge of model selection in this framework, we propose XStARS, a stability-based criterion for multi-dimensional hyperparameter tuning. We assess the performance of collaborative graphical lasso and the corresponding model selection procedure through simulations, and we apply them to publicly available multi-omics data. This application demonstrated that collaborative graphical lasso recovers established biological interactions while suggesting novel, biologically coherent connections.

Availability and implementation: We implemented collaborative graphical lasso as an R package, available on CRAN as coglasso. The results of the manuscript can be reproduced by running the code available at https://github.com/DrQuestion/coglasso_reproducible_code
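
For orientation, the standard graphical lasso that the method extends can be written as a penalized Gaussian log-likelihood problem. The notation below is an assumption made for illustration, not taken from the paper: $S$ is the empirical covariance matrix, $\Theta$ the precision matrix, and $\lambda \geq 0$ a single sparsity penalty.

$$
\hat{\Theta} = \arg\max_{\Theta \succ 0} \; \log\det\Theta - \operatorname{tr}(S\Theta) - \lambda \lVert \Theta \rVert_1
$$

Nonzero off-diagonal entries of $\hat{\Theta}$ correspond to edges of the inferred network. The collaborative term and the separate within- and between-layer penalties described in the abstract modify this objective; their exact form is given in the paper body and is not reproduced here.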

Paper Structure (10 sections, 6 equations, 3 figures)


Figures (3)

  • Figure 1: Results of simulated network reconstruction for coglasso (in yellow-red) and glasso (in cyan-blue). Performance was measured in terms of network structure recovery (with $F_1$ and MCC) and of precision-matrix estimation (with Kullback-Leibler divergence, KLD), in three increasingly complex and high-dimensional scenarios (networks with $60$, $100$, and $150$ nodes). The upper panel shows the $F_1$ and MCC results, while the lower panel shows the KLD results. In each panel the scenarios are distributed along the rows. For each replicate and in all scenarios, the "oracle" measure of each method was taken: the figure reports only the measure of the best-performing network of each method from the grid of explored hyperparameters. Coglasso shows an advantage over glasso according to all measures, and the advantage becomes larger as the complexity of the problem increases. Importantly, the $c$ value of the best-performing coglasso network is larger than zero across all replicates of the two most complex scenarios, and only once for the least complex (see Supplementary Figure S1). This implies a role of collaboration in achieving the best possible network. Each violin in the figure is composed of $100$ data points, one for each replicate of the simulations, and the networks were reconstructed from datasets of $n = 50$ observations.
  • Figure 2: Performance of model selection with XStARS for coglasso (in yellow-red) and StARS for glasso (in cyan-blue) over three increasingly complex and high-dimensional scenarios (networks with $60$, $100$, and $150$ nodes). Model selection performance was measured by comparing the network structure selected in each replicate with the structure of the simulated ground-truth network, in terms of $F_1$ and MCC. The two measures are distributed along the columns of the grid, and the scenarios along the rows. The XStARS-selected coglasso network has an especially large advantage over the StARS-selected glasso network in the two most complex scenarios (rows two and three). Each violin in the figure is composed of $100$ data points, one for each replicate of the simulations, and the networks were reconstructed from datasets of $n = 50$ observations.
  • Figure 3: Subnetwork of Cirbp and its neighbouring nodes (left) and second largest community (right) from the coglasso network shown in Supplementary Figure 3. Blue nodes represent transcripts, while pink nodes represent metabolites. Blue edges represent negative partial correlations, while red edges stand for positive partial correlations. There are four line intensities, representing the strength of the edges. Dotted lines represent the first quartile of edge strengths of the network, while full lines represent the last quartile. On the left, node labels with asterisks belong to genes known to be involved with unfolded protein response or protein folding in general. Cirbp shows a negative relation to most of the genes of its neighbourhood.
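
The captions of Figures 1 and 2 score reconstructed networks against a ground truth with $F_1$ and MCC, and Figure 1 reports an "oracle" value, that is, the best score over a grid of candidate networks. The following is a minimal Python sketch of those two ideas, not the authors' evaluation code (their implementation is the R package coglasso). The helper names (`adjacency`, `edge_confusion`, `oracle`), the toy 4-node network, and the labels `"sparse"` and `"dense"`, which stand in for hyperparameter settings, are all hypothetical.

```python
import math


def adjacency(p, edges):
    """Build a symmetric 0/1 adjacency matrix for p nodes from an edge list."""
    a = [[0] * p for _ in range(p)]
    for i, j in edges:
        a[i][j] = a[j][i] = 1
    return a


def edge_confusion(est, true):
    """Count (TP, FP, FN, TN) over node pairs in the upper triangle."""
    p = len(true)
    tp = fp = fn = tn = 0
    for i in range(p):
        for j in range(i + 1, p):
            e, t = bool(est[i][j]), bool(true[i][j])
            if e and t:
                tp += 1
            elif e:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def f1_score(tp, fp, fn, tn):
    """F1 = 2TP / (2TP + FP + FN); TN is accepted but unused."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def oracle(candidates, true, score):
    """Return (label, value) of the best-scoring candidate network."""
    scored = {k: score(*edge_confusion(a, true)) for k, a in candidates.items()}
    best = max(scored, key=scored.get)
    return best, scored[best]


# Hypothetical ground truth: a chain 0-1-2-3.
truth = adjacency(4, [(0, 1), (1, 2), (2, 3)])

# Hypothetical candidate networks, one per hyperparameter setting.
candidates = {
    "sparse": adjacency(4, [(0, 1)]),
    "dense": adjacency(4, [(0, 1), (1, 2), (2, 3), (0, 3)]),
}

best_f1 = oracle(candidates, truth, f1_score)
best_mcc = oracle(candidates, truth, mcc)
print("oracle F1:", best_f1)
print("oracle MCC:", best_mcc)
```

Counting pairs only in the upper triangle avoids double-counting edges of an undirected network. The oracle uses the ground truth to pick the best candidate, so it measures the best performance a method could reach on the grid, whereas a model selection procedure such as StARS or XStARS has to choose without access to the truth.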