PSBench: a large-scale benchmark for estimating the accuracy of protein complex structural models

Pawan Neupane; Jian Liu; Jianlin Cheng

TL;DR

PSBench is a large-scale, publicly available benchmark for estimating the accuracy of protein complex structural models. It addresses the scarcity of EMA training data with four datasets derived from CASP15 and CASP16, totaling over one million models, each annotated with 10 quality scores at the global, interface, and local levels. It also provides automated labeling tools, baseline EMA methods, and standardized metrics for training and benchmarking. Its utility is demonstrated by GATE-based EMA models, which performed strongly in model ranking and selection in the blind, community-wide CASP16 evaluation. PSBench is intended to play a role for EMA research in protein complex modeling similar to the one ImageNet plays in computer vision, and the authors plan to expand the set of targets and invite community contributions.

Abstract

Predicting protein complex structures is essential for protein function analysis, protein design, and drug discovery. While AI methods like AlphaFold can predict accurate structural models for many protein complexes, reliably estimating the quality of these predicted models (estimation of model accuracy, or EMA) for model ranking and selection remains a major challenge. A key barrier to developing effective machine learning-based EMA methods is the lack of large, diverse, and well-annotated datasets for training and evaluation. To address this gap, we introduce PSBench, a benchmark suite comprising four large-scale, labeled datasets generated during the 15th and 16th community-wide Critical Assessment of Protein Structure Prediction (CASP15 and CASP16). PSBench includes over one million structural models covering a wide range of protein sequence lengths, complex stoichiometries, functional classes, and modeling difficulties. Each model is annotated with multiple complementary quality scores at the global, local, and interface levels. PSBench also provides multiple evaluation metrics and baseline EMA methods to facilitate rigorous comparisons. To demonstrate PSBench's utility, we trained and evaluated GATE, a graph transformer-based EMA method, on the CASP15 data. GATE was blindly tested in CASP16 (2024), where it ranked among the top-performing EMA methods. These results highlight PSBench as a valuable resource for advancing EMA research in protein complex modeling. PSBench is publicly available at: https://github.com/BioinfoMachineLearning/PSBench.
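As an illustration of what evaluating an EMA method for model ranking and selection can involve, the sketch below computes two measures widely used in the EMA literature for a single target: the Pearson correlation between predicted and true quality scores, and the ranking loss (the true score of the best available model minus the true score of the model the method ranks first). This is a minimal sketch under stated assumptions, not PSBench's own code: the function names and the example scores are hypothetical, and the metrics PSBench actually ships may be defined differently.

```python
from math import sqrt


def pearson(predicted, true):
    """Pearson correlation between predicted and true quality scores."""
    n = len(predicted)
    if n != len(true) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_p = sum(predicted) / n
    mean_t = sum(true) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(predicted, true))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_t = sum((t - mean_t) ** 2 for t in true)
    if var_p == 0 or var_t == 0:
        return float("nan")  # correlation is undefined for constant scores
    return cov / sqrt(var_p * var_t)


def ranking_loss(predicted, true):
    """True score of the best model minus true score of the top-ranked model."""
    if len(predicted) != len(true) or not predicted:
        raise ValueError("need two equal-length, non-empty lists")
    top_ranked = max(range(len(predicted)), key=lambda i: predicted[i])
    return max(true) - true[top_ranked]


# Hypothetical scores for four structural models of one target.
predicted_scores = [0.90, 0.70, 0.40, 0.20]  # EMA method output
true_scores = [0.80, 0.85, 0.50, 0.30]       # e.g. a global quality label

print("Pearson correlation:", pearson(predicted_scores, true_scores))
print("Ranking loss:", ranking_loss(predicted_scores, true_scores))
```

A ranking loss of zero means the method selected the best available model for that target. In a benchmark setting, such per-target values would typically be averaged over all targets.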
