Enhancing Reinforcement learning in 3-Dimensional Hydrophobic-Polar Protein Folding Model with Attention-based layers
Peizheng Liu, Hitoshi Iba
TL;DR
This paper tackles 3D hydrophobic–polar protein folding by casting folding as a reinforcement learning task within a self-avoiding walk environment and solving it with a Transformer-based DQN enhanced by dueling and double Q-learning, prioritized replay, and symmetry-breaking constraints. The approach represents states as sequences processed by Transformer encoders, yielding a global state representation via a CLS token and leveraging positional encoding to capture residue order. Empirical results on standard benchmarks show competitive performance, achieving best-known values for several sequences and near-optimal results for longer chains, while also revealing sensitivity to hyperparameters and exploration strategies. The work demonstrates the potential of attention-based reinforcement learning for lattice HP folding and highlights clear avenues for future improvements, including systematic ablations, advanced exploration methods, and extensions to 2D/3D folding problems.
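The dueling and double Q-learning components named above can be illustrated with a small, framework-free sketch. It assumes the commonly used mean-subtracted dueling aggregation and the standard double DQN target; the exact formulation used in this work is not stated here. The function names `dueling_q` and `double_dqn_target` are hypothetical.

```python
from typing import List


def dueling_q(value: float, advantages: List[float]) -> List[float]:
    """Combine a state value and per-action advantages into Q-values.

    Assumption: mean-subtracted aggregation, Q(s, a) = V(s) + A(s, a) - mean(A).
    """
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]


def double_dqn_target(
    reward: float,
    done: bool,
    gamma: float,
    q_online_next: List[float],
    q_target_next: List[float],
) -> float:
    """Double DQN target for a single transition.

    The online network selects the next action; the target network
    evaluates it. Terminal transitions use the reward alone.
    """
    if done:
        return reward
    # Action selection by the online network.
    best_action = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    # Action evaluation by the target network.
    return reward + gamma * q_target_next[best_action]


if __name__ == "__main__":
    print(dueling_q(1.0, [1.0, 2.0, 3.0]))
    print(double_dqn_target(1.0, False, 0.5, [0.1, 0.9], [4.0, 2.0]))
```

Separating selection from evaluation in this way is the usual motivation for double Q-learning: it reduces the overestimation that arises when a single network both picks and scores the maximizing action.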
Abstract
Transformer-based architectures have recently propelled advances in sequence modeling across domains, but their application to the hydrophobic-hydrophilic (H-P) model for protein folding remains relatively unexplored. In this work, we adapt a Deep Q-Network (DQN) integrated with attention mechanisms (Transformers) to address the 3D H-P protein folding problem. Our system formulates folding decisions as a self-avoiding walk in a reinforcement learning environment and employs a specialized reward function based on favorable hydrophobic interactions. To improve performance, the method incorporates validity checks, including symmetry-breaking constraints, together with dueling and double Q-learning and prioritized replay to focus learning on critical transitions. Experimental evaluations on standard benchmark sequences demonstrate that our approach reaches the best-known solutions for several shorter sequences and obtains near-optimal results for longer chains. This study underscores the promise of attention-based reinforcement learning for protein folding and provides a prototype of a Transformer-based Q-network architecture for 3-dimensional lattice models.
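To make the self-avoiding walk formulation and the hydrophobic-interaction reward concrete, the following is a minimal sketch of a 3D cubic-lattice H-P model. It assumes the conventional H-P energy, in which each pair of H residues that are lattice neighbors but not chain neighbors contributes one contact; the reward shaping actually used in this work may differ. The names `MOVES`, `place_chain`, and `count_hh_contacts` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Coord = Tuple[int, int, int]

# The six unit moves available on a cubic lattice.
MOVES: Dict[str, Coord] = {
    "+x": (1, 0, 0), "-x": (-1, 0, 0),
    "+y": (0, 1, 0), "-y": (0, -1, 0),
    "+z": (0, 0, 1), "-z": (0, 0, -1),
}


def place_chain(moves: List[str]) -> Optional[List[Coord]]:
    """Place residues on the lattice, starting at the origin.

    Returns None if the walk revisits a site, i.e. it is not self-avoiding.
    """
    coords: List[Coord] = [(0, 0, 0)]
    occupied = {coords[0]}
    for move in moves:
        dx, dy, dz = MOVES[move]
        x, y, z = coords[-1]
        nxt = (x + dx, y + dy, z + dz)
        if nxt in occupied:
            return None  # collision: invalid fold
        coords.append(nxt)
        occupied.add(nxt)
    return coords


def count_hh_contacts(sequence: str, coords: List[Coord]) -> int:
    """Count H-H pairs that are lattice neighbors but not chain neighbors."""
    index_at = {c: i for i, c in enumerate(coords)}
    contacts = 0
    for i, (x, y, z) in enumerate(coords):
        if sequence[i] != "H":
            continue
        for dx, dy, dz in MOVES.values():
            j = index_at.get((x + dx, y + dy, z + dz))
            # j > i + 1 skips chain neighbors and counts each pair once.
            if j is not None and j > i + 1 and sequence[j] == "H":
                contacts += 1
    return contacts


if __name__ == "__main__":
    fold = place_chain(["+x", "+y", "-x"])  # a planar U-turn
    if fold is not None:
        print("energy:", -count_hh_contacts("HPPH", fold))
```

Under these assumptions, the energy of a conformation is the negative contact count, so an agent that maximizes contacts is minimizing H-P energy. A move that makes `place_chain` return `None` corresponds to the kind of invalid action a validity check would reject.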
