Assessment of scoring functions for computational models of protein-protein interfaces
Jacob Sumner, Grace Meng, Naomi Brandt, Alex T. Grigas, Andrés Córdoba, Mark D. Shattuck, Corey S. O'Hern
TL;DR
The paper benchmarks seven PPI scoring functions on rigid-body re-docked heterodimer models sampled uniformly over DockQ, quantifying how well the scores correlate with DockQ across targets and datasets. It identifies two physical interface features (the interface contact count Nc and the separability S) that strongly influence scoring difficulty, and demonstrates that a two-feature SVR using these features matches or exceeds existing scoring functions. It shows that uniform sampling of models over DockQ is crucial for fair evaluation and that scoring performance degrades as monomers move away from their bound conformations, highlighting limitations in flexible docking. The authors advocate incorporating physically discriminative features into scoring models to improve PPI predictions and CAPRI-style assessments, and suggest future work on physics-informed learning and integration with GNNs.
Abstract
A goal of computational studies of protein-protein interfaces (PPIs) is to predict the binding site between two monomers that form a heterodimer. The simplest version of this problem is to rigidly re-dock the bound forms of the monomers, which involves generating computational models of the heterodimer and then scoring them to determine the most native-like models. Scoring functions have been assessed previously using rank- and classification-based metrics; however, these methods are sensitive to the number and quality of models in the scoring function training set. We assess the accuracy of seven PPI scoring functions by comparing their scores to a measure of structural similarity to the X-ray crystal structure (i.e., the DockQ score) for a non-redundant set of heterodimers from the Protein Data Bank. For each heterodimer, we generate re-docked models uniformly sampled over DockQ and calculate the Spearman correlation between the PPI scores and DockQ. For some targets, the scores and DockQ are highly correlated; however, for many targets, the correlations are weak. Several physical features can explain the difference between difficult- and easy-to-score targets. For example, strong correlations exist between the score and DockQ for targets with highly intertwined monomers and many interface contacts. We also develop a new score based on only three physical features that matches or exceeds the performance of current PPI scoring functions. These results emphasize that PPI prediction can be improved by focusing on correlations between the PPI score and DockQ and incorporating more discriminating physical features into PPI scoring functions.
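The per-target assessment described above (sample models uniformly over DockQ, then compute the Spearman correlation between score and DockQ) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the number of bins, the number of models kept per bin, and the convention that a higher score means a more native-like model are all assumptions made for illustration.

```python
import random
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def sample_uniform_in_dockq(models, n_bins=10, per_bin=5, seed=0):
    """Keep at most `per_bin` models from each equal-width DockQ bin on [0, 1].

    `models` is a list of (dockq, score) pairs. The bin count and per-bin
    quota are hypothetical values, not taken from the paper.
    """
    rng = random.Random(seed)
    bins = [[] for _ in range(n_bins)]
    for dockq, score in models:
        idx = min(int(dockq * n_bins), n_bins - 1)  # DockQ = 1.0 goes in the last bin
        bins[idx].append((dockq, score))
    kept = []
    for b in bins:
        rng.shuffle(b)
        kept.extend(b[:per_bin])
    return kept


def score_dockq_correlation(models, **kwargs):
    """Spearman correlation between score and DockQ on a DockQ-balanced subset."""
    kept = sample_uniform_in_dockq(models, **kwargs)
    dockq = [d for d, _ in kept]
    score = [s for _, s in kept]
    return spearman(score, dockq)
```

For one target, `score_dockq_correlation` would be called with the list of (DockQ, score) pairs for its re-docked models. For a scoring function where lower values indicate better models (for example, an energy), the sign of the score would need to be flipped first, or the correlation interpreted with the opposite sign.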
