The Signed Two-Space Proximity Model for Learning Representations in Protein-Protein Interaction Networks
Nikolaos Nakis, Chrysoula Kosma, Anastasia Brativnyk, Michail Chatzianastasis, Iakovos Evdaimon, Michalis Vazirgiannis
TL;DR
The paper addresses representation learning for signed protein-protein interaction (SPPI) networks, which contain both activating (positive) and inhibiting (negative) interactions. It introduces the Signed Two-Space Proximity Model (S2-SPM), which uses two independent latent spaces and archetypal analysis to model positive and negative interactions separately, and employs a Skellam likelihood to reconstruct the signed graph. The work reports superior signed link prediction against baseline methods, robust recovery of archetype structures as measured by BNMI, and biologically meaningful GO-term enrichment for the archetypes, supporting both interpretability and functional relevance. By decoupling the positive and negative spaces, S2-SPM offers a principled and interpretable framework for decoding complex regulatory patterns in SPPI networks, and the publicly available SIGNOR data and GO annotations enable further exploration.
Abstract
Accurately predicting complex protein-protein interactions (PPIs) is crucial for decoding biological processes, from cellular functioning to disease mechanisms. However, experimental methods for determining PPIs are expensive and time-consuming, so attention has recently turned to machine learning approaches. Even so, little effort has been devoted to analyzing signed PPI (SPPI) networks, which capture both activating (positive) and inhibitory (negative) interactions. To represent these biological relationships accurately, we present the Signed Two-Space Proximity Model (S2-SPM) for signed PPI networks, which explicitly incorporates both types of interactions and thereby reflects the complex regulatory mechanisms within biological systems. The model uses two independent latent spaces to differentiate between positive and negative interactions, and represents protein similarity through proximity in these spaces. Our approach also enables the identification of archetypes representing extreme protein profiles. In link prediction tasks against relevant baseline methods, S2-SPM shows superior performance in predicting both the presence and the sign of interactions in SPPI networks. Additionally, the biological relevance of the identified archetypes is confirmed by an enrichment analysis of Gene Ontology (GO) terms, which reveals that distinct biological tasks are associated with the archetypal groups formed by each type of interaction. This analysis is further supported by statistical significance testing and a sensitivity analysis, providing insights into the functional roles of different interaction types. Finally, the robustness and consistency of the extracted archetype structures are confirmed using the Bayesian Normalized Mutual Information (BNMI) metric, supporting the model's reliability in capturing meaningful SPPI patterns.
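To make the two-space idea concrete, the following minimal sketch scores a signed interaction with a Skellam likelihood, the likelihood named in the TL;DR. A Skellam variable is the difference of two independent Poisson counts, so one rate can be tied to proximity in the positive space and the other to proximity in the negative space. The rate function exp(bias_i + bias_j - distance), the toy coordinates, and all function names are illustrative assumptions, not the paper's exact parameterization, and the archetypal-analysis constraint on the embeddings is omitted.

```python
import math

def log_bessel_i(order, x, terms=60):
    """Log of the modified Bessel function I_order(x), integer order >= 0, x > 0.

    Uses the power series sum_m (x/2)^(2m+order) / (m! (m+order)!),
    accumulated in log space for numerical stability.
    """
    logs = [
        (2 * m + order) * math.log(x / 2.0)
        - math.lgamma(m + 1)
        - math.lgamma(m + order + 1)
        for m in range(terms)
    ]
    peak = max(logs)
    return peak + math.log(sum(math.exp(v - peak) for v in logs))

def skellam_logpmf(y, rate_pos, rate_neg):
    """Log P(Y = y) for Y = N_pos - N_neg with independent Poisson counts."""
    return (
        -(rate_pos + rate_neg)
        + 0.5 * y * (math.log(rate_pos) - math.log(rate_neg))
        + log_bessel_i(abs(y), 2.0 * math.sqrt(rate_pos * rate_neg))
    )

def rate(bias_i, bias_j, u, v):
    """Assumed proximity-based rate: closer points give a larger rate."""
    return math.exp(bias_i + bias_j - math.dist(u, v))

# Toy embeddings (made-up values): one 2-D position per protein in each space.
pos_space = {"A": (0.0, 0.0), "B": (0.1, 0.0), "C": (3.0, 3.0)}
neg_space = {"A": (0.0, 0.0), "B": (4.0, 0.0), "C": (0.2, 0.1)}

def pair_loglik(i, j, y, bias=0.0):
    """Log-likelihood of signed weight y between proteins i and j."""
    lam_pos = rate(bias, bias, pos_space[i], pos_space[j])
    lam_neg = rate(bias, bias, neg_space[i], neg_space[j])
    return skellam_logpmf(y, lam_pos, lam_neg)

# A and B are close only in the positive space; A and C only in the negative one.
print(pair_loglik("A", "B", +1) > pair_loglik("A", "B", -1))
print(pair_loglik("A", "C", -1) > pair_loglik("A", "C", +1))
```

In this toy setup, a pair that is close in the positive space but far apart in the negative space receives a higher likelihood for an activating edge than for an inhibiting one, and the reverse holds when the distances are swapped. Because the two spaces are independent, a protein pair can be near in one and distant in the other.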
