Dihedral Angle Adherence: Evaluating Protein Structure Predictions in the Absence of Experimental Data
Musa Azeem, Homayoun Valafar
TL;DR
The paper addresses the challenge of evaluating protein structure predictions without experimental ground-truth data. It introduces a per-residue dihedral-adherence metric: context-specific $\phi$ and $\psi$ angle distributions are mined from the Protein Data Bank via windowed comparisons and clustering, and the Mahalanobis distance from the predicted angles to these distributions is computed. The metric correlates significantly with RMSD across CASP-14 predictions ($R^2 = 0.755$, $p<0.01$) and also reveals the residue-level locations where predictions underperform, enabling targeted improvements. This reference-free framework offers a scalable alternative to RMSD for guiding the development and refinement of protein structure predictions, including AlphaFold outputs, when ground-truth structures are unavailable.
Abstract
Determining the 3D structures of proteins is essential to understanding their behavior in the cellular environment. Computational methods for predicting protein structures have advanced, but assessing prediction accuracy remains a challenge. The traditional metric, RMSD, requires an experimentally determined reference structure and offers little insight into which parts of a prediction need improvement. We propose an alternative based on dihedral angles that bypasses the need for a reference structure of the evaluated protein. Our method segments a protein into amino acid subsequences, searches for matching subsequences across numerous proteins, and compares the dihedral angles of those matches with the predicted angles, computing a metric with the Mahalanobis distance. Evaluated on many predictions, our approach correlates with RMSD and identifies regions where predictions can be improved. This method offers a promising route for assessing and improving protein structure predictions.
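As a rough illustration of the core idea, the following minimal sketch computes the Mahalanobis distance from one predicted $(\phi, \psi)$ pair to a sample of reference angles, such as those collected from matching subsequences. It is not the authors' implementation: the function name `mahalanobis_adherence` and the sample values are hypothetical, the windowed matching and clustering steps are omitted, and the angles are treated as ordinary real numbers in degrees, so the reference sample is assumed not to wrap around $\pm 180^\circ$.

```python
import numpy as np


def mahalanobis_adherence(reference_angles, predicted_angle):
    """Mahalanobis distance from a predicted (phi, psi) pair to a reference sample.

    reference_angles: iterable of (phi, psi) pairs in degrees (hypothetical input,
        e.g. angles gathered from matching subsequences).
    predicted_angle: a single (phi, psi) pair in degrees.
    Assumption: angles are treated as linear values; periodicity is ignored.
    """
    ref = np.asarray(reference_angles, dtype=float)  # shape (n, 2)
    pred = np.asarray(predicted_angle, dtype=float)  # shape (2,)

    mean = ref.mean(axis=0)
    cov = np.cov(ref, rowvar=False)  # 2 x 2 sample covariance

    # The pseudo-inverse keeps the computation defined if the covariance
    # is singular or nearly singular (e.g. very few reference samples).
    inv_cov = np.linalg.pinv(cov)

    diff = pred - mean
    squared = float(diff @ inv_cov @ diff)
    return float(np.sqrt(max(squared, 0.0)))


# Hypothetical reference sample clustered in the alpha-helical region.
reference = [(-60, -45), (-65, -42), (-58, -47), (-63, -40), (-55, -48)]

close_score = mahalanobis_adherence(reference, (-60, -44))
far_score = mahalanobis_adherence(reference, (-120, 130))
print(f"near the cluster: {close_score:.2f}, far from it: {far_score:.2f}")
```

A smaller distance means the predicted angles agree more closely with the reference distribution for that sequence context, while a large distance flags a residue whose predicted backbone conformation is unusual for that context.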
