Mechanisms of AI Protein Folding in ESMFold
Kevin Lu, Jannik Brinkmann, Stefan Huber, Aaron Mueller, Yonatan Belinkov, David Bau, Chris Wendler
TL;DR
This work probes how AI protein folding models derive structure from sequence by dissecting ESMFold's folding trunk. Through activation patching, it identifies two sequential computational stages: early blocks propagate sequence-derived biochemical signals into the pairwise representation, while late blocks refine pairwise geometric features that determine the final coordinates. The study shows that the pairwise representation functions as a distance map and that pair2seq biases mediate geometry-to-sequence communication, with causal interventions such as charge steering and distance steering producing expected structural effects. These findings offer a mechanistic, causal understanding of folding in a state-of-the-art model and suggest generalizable stages across secondary-structure motifs. The work advances interpretability in protein structure prediction by localizing computations within the trunk and demonstrating controllable interventions that influence folding outcomes.
Abstract
How do protein structure prediction models fold proteins? We investigate this question by tracing how ESMFold folds a beta hairpin, a prevalent structural motif. Through counterfactual interventions on model latents, we identify two computational stages in the folding trunk. In the first stage, early blocks initialize pairwise biochemical signals: residue identities and associated biochemical features, such as charge, flow from sequence representations into pairwise representations. In the second stage, late blocks develop pairwise spatial features: distance and contact information accumulates in the pairwise representation. We demonstrate that the mechanisms underlying ESMFold's structural decisions can be localized, traced through interpretable representations, and manipulated with strong causal effects.
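
To make the idea of a counterfactual intervention on latents concrete, the following is a minimal sketch of activation patching on a toy two-stream model. It is not ESMFold and not the authors' code: the scalar "seq" and "pair" streams, the block update rules, and the names block, run, and recovery are all assumptions chosen for illustration. The toy is wired so that early blocks write the sequence signal into the pair stream and late blocks only rescale the pair stream, loosely mirroring the two stages described above.

```python
# Toy stand-in for a folding trunk (hypothetical; NOT ESMFold).
# Each stream is a single float so the arithmetic is easy to follow.
N_BLOCKS, N_EARLY = 4, 2


def block(i, seq, pair):
    """One toy block: early blocks write seq into pair, late blocks refine pair only."""
    if i < N_EARLY:
        return seq, pair + 0.5 * seq
    return seq, 1.5 * pair


def run(seq, pair=0.0, patch=None):
    """Run the toy trunk and cache every block's inputs.

    patch = (block_index, stream, value) overwrites one stream at the
    input of one block; the overwritten value then propagates downstream.
    """
    cache = []
    for i in range(N_BLOCKS):
        if patch is not None and patch[0] == i:
            if patch[1] == "seq":
                seq = patch[2]
            else:
                pair = patch[2]
        cache.append({"seq": seq, "pair": pair})
        seq, pair = block(i, seq, pair)
    return pair, cache  # the final pair value plays the role of the output


# Clean and corrupted inputs differ only in the sign of the sequence signal.
clean_out, clean_cache = run(seq=2.0)
corrupt_out, _ = run(seq=-2.0)


def recovery(i, stream):
    """Fraction of the clean-vs-corrupt output gap restored by one patch."""
    patched_out, _ = run(seq=-2.0, patch=(i, stream, clean_cache[i][stream]))
    return (patched_out - corrupt_out) / (clean_out - corrupt_out)


results = {
    stream: [recovery(i, stream) for i in range(N_BLOCKS)]
    for stream in ("seq", "pair")
}
```

In this toy, results["seq"] is [1.0, 0.5, 0.0, 0.0] and results["pair"] is [0.0, 0.5, 1.0, 1.0]: patching the sequence stream only matters at early blocks, while patching the pair stream only matters from the point where the sequence signal has been written into it. A real experiment would replace the scalar streams with the model's sequence and pairwise tensors and the output gap with a structural metric.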
