ContactNet: Geometric-Based Deep Learning Model for Predicting Protein-Protein Interactions

Matan Halfon, Tomer Cohen, Raanan Fattal, Dina Schneidman-Duhovny

TL;DR

ContactNet is a geometry-aware, attention-based Graph Neural Network that classifies docked protein–protein interaction models without relying on MSA signals. The model combines three components: a distance-aware graph attention embedding of residues, a segment-centric contact descriptor built from interacting patches, and an interaction transformer that produces the final score for a docked model. Results on antibody–antigen docking show significant improvements over the SOAP-PP and AFM baselines, with Top-1/Top-5 accuracies of 68%/75% on unbound antibodies and about 43% Top-10 on modeled antibodies, plus 50% Top-1 and 70% Top-10 in epitope prediction. The MSA-free approach generalizes beyond antibodies to broader PPIs, offering a scalable way to assess docking models in settings where co-evolutionary signals are unavailable.

Abstract

Deep learning approaches have achieved significant progress in predicting protein structures. These methods are often applied to protein-protein interactions (PPIs), yet they require a Multiple Sequence Alignment (MSA), which is unavailable for various interactions, such as antibody-antigen. Computational docking methods are capable of sampling accurate complex models, but they also produce thousands of invalid configurations. The design of scoring functions for identifying accurate models is a long-standing challenge. We develop a novel attention-based Graph Neural Network (GNN), ContactNet, for classifying PPI models obtained from docking algorithms into accurate and incorrect ones. When trained on docked antigen and modeled antibody structures, ContactNet doubles the accuracy of current state-of-the-art scoring functions, placing an accurate model among its Top-10 in 43% of the test cases. When applied to unbound antibodies, its Top-10 accuracy increases to 65%. This performance is achieved without MSA, and the approach is applicable to other types of interactions, such as host-pathogen or general PPIs.
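The Top-N figures quoted above count a test case as a success when at least one accurate model appears among the N best-scored docking models. The following is a minimal sketch of that metric, not the authors' evaluation code. The function name, the `(score, is_accurate)` tuple layout, the toy data, and the convention that a higher score is better are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Each docking model is a (score, is_accurate) pair.
# Assumption: a higher score means the scoring function prefers the model.
ScoredModel = Tuple[float, bool]


def top_n_success_rate(cases: Dict[str, List[ScoredModel]], n: int) -> float:
    """Fraction of cases with at least one accurate model among the n best-scored."""
    if not cases:
        return 0.0
    hits = 0
    for models in cases.values():
        # Rank the models of this case from best to worst score.
        ranked = sorted(models, key=lambda m: m[0], reverse=True)
        if any(is_accurate for _, is_accurate in ranked[:n]):
            hits += 1
    return hits / len(cases)


# Toy data (hypothetical, not taken from the paper).
toy_cases = {
    "case_a": [(0.9, False), (0.8, True), (0.1, False)],
    "case_b": [(0.7, True), (0.2, False)],
    "case_c": [(0.6, False), (0.5, False), (0.4, False)],
}

top1 = top_n_success_rate(toy_cases, 1)
top2 = top_n_success_rate(toy_cases, 2)
```

In the toy data only `case_b` has an accurate model ranked first, while `case_a` gains one at rank two, so the rate grows with N. A case such as `case_c`, which contains no accurate model at all, caps the achievable success rate for any scoring function that only re-ranks the same docking models.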

Paper Structure

This paper contains 12 sections, 4 figures.

Figures (4)

  • Figure 1: ContactNet architecture. A. Single protein embedding module. This module uses sequence, secondary structure, solvent accessibility, and the $C_\alpha$-$C_\alpha$ protein distance matrix. The network learns a new representation of physico-chemical features using an encoder transformer. B. Contacts embedding module. This module extracts the most contacting and possibly interacting linear segments from the single protein embedding stage and encodes them into interaction descriptors. C. Interaction transformer module. The transformer module is trained to classify the entire interaction interfaces based on these embedded contact descriptors.
  • Figure 2: Prediction protocol for antigen-antibody complexes. The antibody sequence is modeled via AFM to obtain five modeled structures that are docked using PatchDock. The top docking models ranked by SOAP-PP score are reevaluated by ContactNet. Finally, the docking models from the five AFM antibody models are clustered to reduce redundancy.
  • Figure 3: Performance of ContactNet, AFM, and SOAP-PP on the modeled antibodies test set. A. Success rate for Top-N predictions for ContactNet (blue), AFM (green), and SOAP-PP (orange). The red line indicates the upper bound for ContactNet and SOAP-PP due to missing acceptable or higher accuracy models. B. Top-5 models of each method divided into high, medium, and acceptable accuracy. C. Success rate for Top-N predictions on the unbound test set for ContactNet, AFM, and SOAP-PP. D. Success rate for Top-N in predicting epitopes for ContactNet, AFM, and SOAP-PP.
  • Figure 4: Figure 1S. Funnels for scoring functions. A. ContactNet, AFM, and SOAP-PP funnels as a function of interface RMSD for the three cases from the test set (1KIP, 5EZO, and 5IAI). Each dot corresponds to a docking model. Acceptable accuracy models are shown in red. B. The top scoring models for each scoring function (ContactNet - blue, AFM - green, SOAP-PP - orange) are shown at the bottom (overlaid on the gray X-ray structure).
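
The prediction protocol described in the Figure 2 caption (shortlist docking models by a baseline score, re-evaluate the shortlist with a second scoring function, then cluster to reduce redundancy) can be illustrated with a toy sketch. This is not the authors' pipeline: the `DockingModel` fields, the 1-D `position` standing in for a docked pose, the greedy clustering rule, and the score directions (lower is better for the baseline, higher is better for the rescoring function) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class DockingModel:
    name: str
    baseline_score: float  # hypothetical pre-filter score; assumed lower = better
    position: float        # toy 1-D stand-in for the docked pose


def rerank_and_cluster(
    models: List[DockingModel],
    rescore: Callable[[DockingModel], float],
    distance: Callable[[DockingModel, DockingModel], float],
    keep: int = 3,
    radius: float = 1.0,
) -> List[DockingModel]:
    """Shortlist by baseline score, re-rank with `rescore`, then drop near-duplicates."""
    # Step 1: keep only the `keep` best models according to the baseline score.
    shortlist = sorted(models, key=lambda m: m.baseline_score)[:keep]
    # Step 2: re-rank the shortlist with the second scoring function
    # (assumed higher = better).
    ranked = sorted(shortlist, key=rescore, reverse=True)
    # Step 3: greedy clustering. A model is kept only if it is farther than
    # `radius` from every representative that has already been kept.
    representatives: List[DockingModel] = []
    for model in ranked:
        if all(distance(model, rep) > radius for rep in representatives):
            representatives.append(model)
    return representatives


# Toy inputs (hypothetical values, not from the paper).
models = [
    DockingModel("A", -10.0, 0.0),
    DockingModel("B", -9.0, 0.2),
    DockingModel("C", -8.0, 5.0),
    DockingModel("D", -1.0, 9.0),
]
toy_scores = {"A": 0.4, "B": 0.9, "C": 0.7, "D": 0.99}

result = rerank_and_cluster(
    models,
    rescore=lambda m: toy_scores[m.name],
    distance=lambda a, b: abs(a.position - b.position),
)
result_names = [m.name for m in result]
```

In this toy run, model D never reaches the rescoring step because the baseline filter removes it, which mirrors the upper bound noted in the Figure 3 caption: a re-ranking stage cannot recover models that the earlier stage discarded. Model A is dropped in the clustering step because it lies within `radius` of the better-scored model B.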