Technical Report of HelixFold3 for Biomolecular Structure Prediction
Lihang Liu, Shanzhuo Zhang, Yang Xue, Xianbin Ye, Kunrui Zhu, Yuxin Li, Yang Liu, Jie Gao, Wenlai Zhao, Hongkun Yu, Zhihua Wu, Xiaonan Zhang, Xiaomin Fang
TL;DR
HelixFold3 aims to replicate AlphaFold3's biomolecular structure prediction capabilities, covering ligands, nucleic acids, and protein complexes. Built on prior PaddleHelix work and trained on pre-2021 PDB data with self-distillation, it achieves competitive accuracy and is released as open source for academic research, alongside an online service offering visualization and API access. Benchmarks on PoseBusters, CASP15 RNA targets, PDB/SAbDab protein complexes, and covalent modifications show HelixFold3 matching or surpassing several baselines, and even AlphaFold3 on specific tasks, while remaining behind in some protein–protein scenarios. The work underscores the potential of accessible, open, diffusion-based pipelines to accelerate biomolecular discovery, with ongoing improvements and wider data coverage planned.
Abstract
The AlphaFold series has transformed protein structure prediction with remarkable accuracy, often matching experimental methods. AlphaFold2, AlphaFold-Multimer, and the latest AlphaFold3 represent significant strides in predicting single protein chains, protein complexes, and biomolecular structures, respectively. While AlphaFold2 and AlphaFold-Multimer are open-sourced, facilitating rapid and reliable predictions, AlphaFold3 is only partially accessible through a limited online server and has not been open-sourced, which restricts further development. To address this limitation, the PaddleHelix team is developing HelixFold3, aiming to replicate AlphaFold3's capabilities. Leveraging insights from previous models and extensive datasets, HelixFold3 achieves accuracy comparable to AlphaFold3 in predicting the structures of conventional ligands, nucleic acids, and proteins. The initial release of HelixFold3 is available as open source on GitHub for academic research, promising to advance biomolecular research and accelerate discoveries. The latest version will be continuously updated on the HelixFold3 web server, which provides both interactive visualization and API access.
