Folding lattice proteins confined on minimal grids using a quantum-inspired encoding

Anders Irbäck, Lucas Knuthson, Sandipan Mohanty

TL;DR

The study tackles the difficult problem of finding minimum-energy, maximally compact lattice-protein structures under steric constraints by recasting it as a quadratic unconstrained binary optimization (QUBO) problem using a field-like binary encoding. It benchmarks hybrid quantum-classical annealing (HA), simulated annealing (SA), and a Gurobi optimizer (GO) against exhaustive enumeration for six 48-residue sequences on a $4\times4\times3$ lattice, showing HA and SA can rapidly reach the ground state while GO struggles due to non-linear connectivity terms. The key contribution is demonstrating that QUBO-based methods can swiftly solve dense lattice-protein problems ($N=48$) and potentially extend to multi-chain systems, offering a practical route around high-energy barriers in sterically constrained landscapes. This work opens a pathway for applying quantum-inspired optimization to dense biomolecular problems and related scheduling-type tasks, with implications for understanding folding in crowded environments and designing efficient encodings for constraint-heavy combinatorial problems.

Abstract

Steric clashes pose a challenge when exploring dense protein systems using conventional explicit-chain methods. A minimal example is a single lattice protein confined on a minimal grid, with no free sites. Finding its minimum energy is a hard optimization problem, with similarities to scheduling problems. It can be recast as a quadratic unconstrained binary optimization (QUBO) problem amenable to classical and quantum approaches. We show that this problem in its QUBO form can be swiftly and consistently solved for chain length 48, using either classical simulated annealing or hybrid quantum-classical annealing on a D-Wave system. In fact, the latter computations required about 10 seconds. We also test linear and quadratic programming methods, which work well for a lattice gas but struggle with chain constraints. All methods are benchmarked against exact results obtained from exhaustive structure enumeration, at a high computational cost.
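
To make the QUBO recasting concrete, here is a minimal Python sketch. It is not the authors' implementation. It assumes a hypothetical toy encoding in which a binary variable with index `b * n + s` equals 1 when bead `b` occupies site `s`, plus three penalty terms (one site per bead, one bead per site, chain connectivity) weighted by illustrative Lagrange parameters. The function names, the $2\times2$ grid, the single contact energy and the annealing schedule are all assumptions chosen for illustration.

```python
import math
import random


def build_qubo(sites, contact, lam):
    """Toy QUBO for a chain that fills every site (hypothetical encoding).

    Variable index b * n + s is 1 when bead b sits on site s.
    Returns (Q, offset) with Q = {(u, v): weight} and u <= v.
    """
    n = len(sites)  # number of beads == number of sites (no free sites)
    lam1, lam2, lam3 = lam

    def adjacent(s, t):
        return sum(abs(a - b) for a, b in zip(sites[s], sites[t])) == 1

    Q = {}

    def add(u, v, w):
        key = (min(u, v), max(u, v))
        Q[key] = Q.get(key, 0.0) + w

    for a in range(n):
        for s in range(n):
            # Linear parts of lam1*(sum_s x - 1)^2 and lam2*(sum_b x - 1)^2.
            add(a * n + s, a * n + s, -(lam1 + lam2))
            for t in range(s + 1, n):
                add(a * n + s, a * n + t, 2 * lam1)  # bead a on sites s and t
                add(s * n + a, t * n + a, 2 * lam2)  # beads s and t on site a
    for b in range(n):
        for c in range(b + 1, n):
            for s in range(n):
                for t in range(n):
                    if c == b + 1 and not adjacent(s, t):
                        add(b * n + s, c * n + t, lam3)  # broken chain bond
                    elif c > b + 1 and adjacent(s, t):
                        add(b * n + s, c * n + t, contact.get((b, c), 0.0))
    # Constant parts of the two squared penalties.
    return Q, n * (lam1 + lam2)


def qubo_energy(Q, offset, x):
    return offset + sum(w * x[u] * x[v] for (u, v), w in Q.items())


def simulated_annealing(Q, n_vars, offset, n_temps=25, sweeps=200,
                        t_hi=5.0, t_lo=0.05, seed=0):
    """Single-bit-flip Metropolis annealing on a geometric temperature ladder."""
    rng = random.Random(seed)
    diag = [0.0] * n_vars
    nbrs = [[] for _ in range(n_vars)]
    for (u, v), w in Q.items():
        if u == v:
            diag[u] += w
        else:
            nbrs[u].append((v, w))
            nbrs[v].append((u, w))
    x = [rng.randint(0, 1) for _ in range(n_vars)]
    e = qubo_energy(Q, offset, x)
    best_x, best_e = x[:], e
    for k in range(n_temps):
        temp = t_hi * (t_lo / t_hi) ** (k / (n_temps - 1))
        for _ in range(sweeps * n_vars):
            u = rng.randrange(n_vars)
            # Energy change from flipping bit u.
            delta = (1 - 2 * x[u]) * (diag[u] + sum(w * x[v] for v, w in nbrs[u]))
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                x[u] ^= 1
                e += delta
                if e < best_e:
                    best_x, best_e = x[:], e
    return best_x, best_e


# Toy instance: 4 beads on a 2x2 grid, one attractive contact (assumed values).
sites = [(0, 0), (0, 1), (1, 1), (1, 0)]
contact = {(0, 3): -1.0}
lam = (1.5, 2.0, 2.0)  # illustrative penalty weights
Q, offset = build_qubo(sites, contact, lam)
best_x, best_e = simulated_annealing(Q, len(sites) ** 2, offset)
print("best energy found:", best_e)
```

In this toy instance every valid chain has energy $-1$, because the two chain ends are always neighbors on a $2\times2$ grid. Whether the annealer reaches a valid state depends on the schedule and on the penalty weights, which generally need tuning for each problem.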

Paper Structure

This paper contains 16 sections, 2 equations, 5 figures, 1 table.

Figures (5)

  • Figure 1: Run-time evolution of the interaction potential $E_{\text{MJ}}$ and the penalty energies $E_1$, $E_2$ and $E_3$ [Eqs. (\ref{eq:EMJ}--\ref{eq:E3})] in an SA run for sequence 5 (Table 1), using $\bm{\lambda}=(1.5,2.0,2.0)$, 25 temperatures and 10,000 sweeps at each temperature. The horizontal lines indicate data on the distribution of $E_{\text{MJ}}$ over valid chain structures, as obtained by exhaustive enumerations (Sec. \ref{sec:methods_enumeration}). The lowest line represents the minimum $E_{\text{MJ}}$, $E_{\text{MJ}}^{\min}$. The other three lines represent the lowest $q$-quantiles for $q=2$ (median), $q=100$ and $q=10^8$, respectively.
  • Figure 2: Average final energy, $\overline{E_f}$, plotted against run time, $t$, in HA, SA and GO computations for the six sequences in Table 1, with Lagrange parameters $\bm{\lambda}=(1.5,2.0,2.0)$ [Eq. (\ref{eq:E})]. Each data point represents an average over 10 (GO) or 100 (HA and SA) independent runs. The horizontal lines indicate data on the distribution of $E_{\text{MJ}}$ over valid chain structures, as obtained by exhaustive enumerations (Sec. \ref{sec:methods_enumeration}). The lowest line represents $E_{\text{MJ}}^{\min}$. The other three lines represent the lowest $q$-quantiles for $q=2$ (median), $q=100$ and $q=10^8$, respectively. (a) Sequence 1. (b) Sequence 2. (c) Sequence 3. (d) Sequence 4. (e) Sequence 5. (f) Sequence 6.
  • Figure 3: Histograms of the final energy $E_f$ [Eq. (\ref{eq:E})] on a log scale for HA and SA computations for the six sequences in Table 1, for the shortest (fast) and longest (slow) run times used. The insets show the density of states (log scale) calculated as a function of $E_{\text{MJ}}$. (a) Sequence 1. (b) Sequence 2. (c) Sequence 3. (d) Sequence 4. (e) Sequence 5. (f) Sequence 6.
  • Figure 4: Energy landscapes showing $E_{\text{MJ}}-E_{\text{MJ}}^{\min}$ against the overlap $Q$ with the native state for the 100 lowest-lying maximally compact states for each of the six sequences in Table 1, as obtained through exhaustive enumeration (Sec. \ref{sec:methods_enumeration}, Appendix \ref{sec:app_enumeration}). The overlap, or nativeness, $Q$ of a structure is the fraction of ground-state contacts it contains. (a) Sequences 1--3, with low-complexity topology A as their native state. (b) Sequences 4--6, with high-complexity topology B as their native state.
  • Figure 5: Parameter dependence of the fraction of correct solutions (hit rate) in the vicinity of the best Lagrange parameters found, $\bm{\lambda}^*=(1.5,2.0,2.0)$, when using QUBO SA and QUBO HA to search for the ground state of sequence 5 (Table 1 in the main text). The hit rate is plotted against $\Delta \lambda_i=\lambda_i-\lambda_i^*$, keeping $\lambda_j= \lambda_j^*$ for $j\neq i$. Each data point represents an average over 100 runs. The SA runs comprised 10,000 sweeps per temperature, while the HA run time was 10 s. Lines are drawn to guide the eye. (a) SA. (b) HA.
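
The overlap $Q$ and the hit rate used in the captions above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' code. It assumes that a contact is a pair of beads on nearest-neighbor lattice sites that are not adjacent along the chain, and that a run counts as a hit when its final energy equals the known ground-state energy. The function names and the $2\times3$ example structures are hypothetical.

```python
def contacts(structure):
    """Set of non-bonded bead pairs (i, j), i < j, on nearest-neighbor sites."""
    found = set()
    for i in range(len(structure)):
        for j in range(i + 2, len(structure)):  # skip chain neighbors
            dist = sum(abs(a - b) for a, b in zip(structure[i], structure[j]))
            if dist == 1:
                found.add((i, j))
    return found


def overlap(structure, native):
    """Fraction of the native contacts that are also present in `structure`."""
    native_contacts = contacts(native)
    return len(contacts(structure) & native_contacts) / len(native_contacts)


def hit_rate(final_energies, e_min, tol=1e-9):
    """Fraction of runs whose final energy matches the ground-state energy."""
    hits = sum(1 for e in final_energies if abs(e - e_min) <= tol)
    return hits / len(final_energies)


# Two maximally compact 6-bead chains on a 2x3 grid (hypothetical examples).
native = [(0, 0), (0, 1), (0, 2), (1, 2), (1, 1), (1, 0)]
other = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 2), (1, 2)]

print(overlap(native, native))                    # 1.0
print(overlap(other, native))                     # 0.0
print(hit_rate([-3.0, -3.0, -2.0, -3.0], -3.0))   # 0.75
```

Here the native chain has contacts (0, 5) and (1, 4), while the second chain has contacts (0, 3) and (2, 5), so the two structures share none and the overlap is zero.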