Protein Structure Prediction in the 3D HP Model Using Deep Reinforcement Learning

Giovanny Espitia, Yui Tik Pang, James C. Gumbart

TL;DR

The study addresses protein structure prediction in the 3D HP lattice model, reframing folding as energy minimization via hydrophobic contacts with $E = -\big(\text{number of valid H-H contacts}\big)$. It introduces two DRL architectures—a reservoir-based hybrid (FFNN-R) and an LSTM with multi-head attention (LSTM-A)—trained under a stabilized Deep Q-Learning framework. For short sequences, FFNN-R delivers faster convergence with ~25% fewer episodes, while for longer sequences, LSTM-A captures long-range dependencies and achieves best-known values, albeit with higher compute and memory demands. The results highlight complementary strengths: efficient local pattern learning by FFNN-R and robust long-range modeling by LSTM-A, suggesting fruitful directions for hybrid designs and scalable protein-folding strategies in lattice models.
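The energy expression above can be made concrete with a short sketch. It assumes the usual HP-lattice convention that a contact is "valid" when two H residues sit on adjacent sites of the cubic lattice but are not consecutive along the chain. The function name `hp_energy` and the coordinate representation are illustrative choices, not taken from the paper.

```python
from typing import List, Tuple

Coord = Tuple[int, int, int]


def hp_energy(sequence: str, coords: List[Coord]) -> int:
    """Return E = -(number of H-H contacts) for a conformation on the cubic lattice.

    Assumption: a contact is counted when two H residues occupy adjacent
    lattice sites (Manhattan distance 1) but are not neighbors along the chain.
    """
    if len(sequence) != len(coords):
        raise ValueError("sequence and coords must have the same length")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")

    contacts = 0
    for i in range(len(coords)):
        if sequence[i] != "H":
            continue
        # Start at i + 2 so chain-adjacent residues are never counted.
        for j in range(i + 2, len(coords)):
            if sequence[j] != "H":
                continue
            dist = sum(abs(a - b) for a, b in zip(coords[i], coords[j]))
            if dist == 1:
                contacts += 1
    return -contacts


# A 4-residue chain folded into a square in the z = 0 plane.
square = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
print(hp_energy("HPPH", square))  # -1: residues 0 and 3 touch
```

Under this definition, a fully extended chain always has energy 0, so any negative value reflects folding that brings hydrophobic residues together.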

Abstract

We address protein structure prediction in the 3D Hydrophobic-Polar lattice model through two novel deep learning architectures. For proteins under 36 residues, our hybrid reservoir-based model combines fixed random projections with trainable deep layers, achieving optimal conformations with 25% fewer training episodes. For longer sequences, we employ a long short-term memory network with multi-headed attention, matching best-known energy values. Both architectures leverage a stabilized Deep Q-Learning framework with experience replay and target networks, demonstrating consistent achievement of optimal conformations while significantly improving training efficiency compared to existing methods.
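The two stabilizing ingredients named in the abstract, experience replay and a target network, can be sketched in a few lines. This is a generic Deep Q-Learning illustration, not the authors' implementation: the names `ReplayBuffer` and `td_targets`, the uniform sampling, and the discount factor are all assumptions made for the example.

```python
import random
from collections import deque
from typing import Callable, Deque, List, Sequence, Tuple

State = Tuple[int, ...]
Transition = Tuple[State, int, float, State, bool]  # (s, a, r, s_next, done)
QFunction = Callable[[State], Sequence[float]]      # state -> one Q-value per action


class ReplayBuffer:
    """Fixed-capacity store of past transitions, sampled uniformly at random."""

    def __init__(self, capacity: int) -> None:
        self._items: Deque[Transition] = deque(maxlen=capacity)

    def push(self, transition: Transition) -> None:
        self._items.append(transition)  # the oldest entry is dropped when full

    def sample(self, batch_size: int, rng: random.Random) -> List[Transition]:
        return rng.sample(list(self._items), batch_size)

    def __len__(self) -> int:
        return len(self._items)


def td_targets(batch: List[Transition], target_q: QFunction, gamma: float) -> List[float]:
    """Compute r + gamma * max_a' Q_target(s', a'), without bootstrapping at episode end."""
    targets = []
    for _state, _action, reward, next_state, done in batch:
        bootstrap = 0.0 if done else gamma * max(target_q(next_state))
        targets.append(reward + bootstrap)
    return targets


# Toy usage with a hypothetical, frozen target function over 5 actions.
buffer = ReplayBuffer(capacity=100)
buffer.push(((0,), 1, 0.0, (1,), False))
buffer.push(((1,), 3, 1.0, (2,), True))
frozen_target = lambda state: [0.0, 0.5, 2.0, 0.1, 0.0]
batch = buffer.sample(2, random.Random(0))
print(sorted(td_targets(batch, frozen_target, gamma=0.9)))
```

In a full training loop, the online network would be regressed toward these targets, and the target function would be refreshed from the online network only every fixed number of steps, which is what keeps the targets from shifting on every update.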

Paper Structure

This paper contains 17 sections, 6 equations, 11 figures, and 4 tables.

Figures (11)

  • Figure 1: Deep reinforcement learning training loop. In a), a batch of experiences is sampled from the replay buffer. The batch then serves as input to the Q-network in b). In c), the agent selects the action $a_{t+1}$ that corresponds to the greatest value in the output Q-value tensor. In d), the resulting experience is stored in the replay memory.
  • Figure 2: The input layer takes an (N, 8, 1) tensor representing the state at a particular timestep. The reservoir is a randomly initialized weight matrix whose topology is specified beforehand. The linear layers form a simple fully connected feed-forward neural network. The output is a (5, 1) tensor representing the Q-value, i.e., the expected total future reward, for each action.
  • Figure 3: LSTM-A architecture for protein folding. Sequential states are processed through LSTM cells, generating hidden states that are weighted by an 8-head attention mechanism. The attention output is mapped to action Q-values through a fully connected layer, enabling the model to leverage both sequential patterns and long-range dependencies.
  • Figure 4: Lowest-energy conformations for different sequences.
  • Figure 5: Plots a) 3d1 and b) 3d5 show the minimum conformation energy as a function of episode.
  • ...and 6 more figures
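
The captions for Figures 1 and 2 describe a fixed random reservoir feeding trainable linear layers, followed by greedy selection over five Q-values. A minimal sketch of that forward pass follows. It rests on several assumptions not stated in the captions: the (N, 8, 1) state is flattened to a vector of length N × 8, the reservoir is a dense matrix with a `tanh` nonlinearity and no recurrence, and the trainable part is reduced to a single readout layer. The class name `ReservoirQNetwork` is hypothetical.

```python
import math
import random
from typing import List

Matrix = List[List[float]]


def random_matrix(rows: int, cols: int, scale: float, rng: random.Random) -> Matrix:
    """Dense matrix with entries drawn uniformly from [-scale, scale]."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix: Matrix, vector: List[float]) -> List[float]:
    """Matrix-vector product using plain Python lists."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class ReservoirQNetwork:
    """Fixed random projection followed by a trainable linear readout (sketch)."""

    def __init__(self, n_inputs: int, n_reservoir: int, n_actions: int = 5, seed: int = 0) -> None:
        rng = random.Random(seed)
        # Assumption: the reservoir is set once at initialization and never trained.
        self.reservoir = random_matrix(n_reservoir, n_inputs, 1.0, rng)
        # Only these readout weights would be updated by gradient descent.
        self.readout = random_matrix(n_actions, n_reservoir, 0.1, rng)

    def q_values(self, state: List[float]) -> List[float]:
        features = [math.tanh(x) for x in matvec(self.reservoir, state)]
        return matvec(self.readout, features)


def greedy_action(q_values: List[float]) -> int:
    """Index of the largest Q-value, as in step c) of Figure 1."""
    return max(range(len(q_values)), key=q_values.__getitem__)


# Toy usage: N = 4 residues with 8 features each, flattened to 32 inputs.
net = ReservoirQNetwork(n_inputs=32, n_reservoir=16)
state = [1.0 if i % 8 == 0 else 0.0 for i in range(32)]
q = net.q_values(state)
print(len(q), greedy_action(q))
```

Because the reservoir weights are frozen, only the readout parameters need gradients, which is one plausible reading of why such a hybrid can train in fewer episodes on short sequences.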