Practical protein-pocket hydration-site prediction for drug discovery on a quantum computer

Daniele Loco, Kisa Barkemeyer, Andre R. R. Carvalho, Jean-Philip Piquemal

TL;DR

This work formulates the water placement problem as a Quadratic Unconstrained Binary Optimization (QUBO) and uses a hybrid approach, coupling a classical three-dimensional reference-interaction site model (3D-RISM) to an efficient quantum optimization solver, to run hardware experiments on up to 123 qubits.

Abstract

Demonstrating the practical utility of Noisy Intermediate-Scale Quantum (NISQ) hardware for recurrent tasks in Computer-Aided Drug Discovery is of paramount importance. We tackle this challenge by performing three-dimensional protein-pocket hydration-site prediction on a quantum computer. Formulating the water placement problem as a Quadratic Unconstrained Binary Optimization (QUBO), we use a hybrid approach, coupling a classical three-dimensional reference-interaction site model (3D-RISM) to an efficient quantum optimization solver, to run hardware experiments on up to 123 qubits. Matching the precision of classical approaches, our results reproduce experimental predictions on real-life protein-ligand complexes. Furthermore, through a detailed resource estimation analysis, we show that accuracy can be systematically improved by increasing the number of qubits, indicating that full quantum utility is within reach. Finally, we provide evidence that advantageous situations could be found for systems where classical optimization struggles to provide optimal solutions. The method has potential for assisting simulations of protein-ligand complexes for drug lead optimization and for the setup of docking calculations.
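
To make the QUBO idea concrete, here is a minimal sketch of a generic water-placement QUBO. It is not the formulation used in the paper, which this page does not spell out. It assumes, purely for illustration, that each candidate grid site i carries a positive weight (for example, a value derived from a 3D-RISM water density) and that two sites closer than a minimum distance should not both be occupied. The names `build_qubo`, `brute_force`, `min_dist`, and `penalty`, as well as the toy coordinates and weights, are hypothetical.

```python
import itertools
import numpy as np

def build_qubo(weights, coords, min_dist, penalty):
    """Build an upper-triangular QUBO matrix Q for the cost x^T Q x.

    Assumed toy model: occupying site i lowers the cost by weights[i];
    occupying two sites closer than min_dist adds `penalty`.
    """
    n = len(weights)
    Q = np.zeros((n, n))
    for i in range(n):
        Q[i, i] = -weights[i]  # linear reward (x_i^2 == x_i for binary x_i)
    for i in range(n):
        for j in range(i + 1, n):
            if np.linalg.norm(coords[i] - coords[j]) < min_dist:
                Q[i, j] = penalty  # quadratic overlap penalty
    return Q

def qubo_cost(Q, x):
    """Evaluate the QUBO cost x^T Q x for a binary vector x."""
    return float(x @ Q @ x)

def brute_force(Q):
    """Exhaustive minimization; only feasible for a handful of variables."""
    n = Q.shape[0]
    best_x, best_cost = None, float("inf")
    for bits in itertools.product((0, 1), repeat=n):
        x = np.array(bits)
        cost = qubo_cost(Q, x)
        if cost < best_cost:
            best_x, best_cost = x, cost
    return best_x, best_cost

# Hypothetical toy instance: four candidate sites on a line (coordinates in angstrom).
coords = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [4.0, 0.0, 0.0], [4.5, 0.0, 0.0]])
weights = np.array([1.0, 0.8, 0.5, 0.9])
Q = build_qubo(weights, coords, min_dist=2.0, penalty=5.0)
best_x, best_cost = brute_force(Q)
print(best_x, best_cost)
```

In this toy instance the minimizer keeps one site from each close pair. Exhaustive search scales as 2^n, which is why instances with more than a hundred variables call for heuristic, exact branch-and-bound, or quantum solvers like those compared in the figures.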

Paper Structure

This paper contains 19 sections, 6 equations, 9 figures, 2 tables.

Figures (9)

  • Figure 1: Cost distributions for the hydration-site prediction QUBO problem instance d as specified in Table \ref{tab:systems}, requiring 116 qubits. The Q-CTRL solver output obtained using IBM Kingston (purple bars) is compared to the optimal solution calculated classically with CPLEX (dashed red line). For reference, the results of simulated annealing (SA, teal bars) as well as the results of a greedy local solver (gray bars) are displayed. Inset: Zoomed-in view of the near-optimal region using finer bins.
  • Figure 2: Probability to sample the optimal solution using the Q-CTRL solver on IBM Kingston (purple bars), simulated annealing (SA, teal bars), and a greedy local solver (gray bars) for the test-set instances labeled according to Table \ref{tab:systems}, with variable numbers indicated in parentheses.
  • Figure 3: Results for the 3b7e-with-ligand instance with a grid spacing of 1.9 Å corresponding to 123 variables. (a) Exact classical (CPLEX) results. Main: Incumbent solution (red line) over time compared to the best solution identified using the Q-CTRL solver on IBM Pittsburgh (dashed purple line). Inset: Optimality gap over time. (b) Cost distributions for the Q-CTRL solver on IBM Pittsburgh (purple bars), simulated annealing (teal bars), and a greedy local solver (gray bars) shown together with the best CPLEX solution (dashed red line).
  • Figure 4: Probability to sample the optimal solution using the Q-CTRL solver on the Heron r2 backend ibm_kingston (red bars) and the Heron r3 backend ibm_pittsburgh (blue bars) for a selection of the test-set instances labeled according to Table \ref{tab:systems}, with variable numbers indicated in parentheses.
  • Figure 5: Analysis of the hydration-site prediction performance across different QUBO instance sizes for PDB ID 3b7e from Table \ref{tab:test_set}, including the ligand in the 3D-RISM calculation. The top panel shows the performance analysis for the quantum optimization run using the Q-CTRL solver on the IBM devices, on the QUBO instances obtained from the following parameters: $\sigma^2 = 1.0~\text{\AA}^2$ and $\tau_g = 0.1$ are used uniformly, and, from smaller to larger instances, $\delta = 1.35, 1.15, 0.95~\text{\AA}$, to discretize the 3D-RISM density for the mapping onto the QUBO problem. The bottom panel shows the same analysis performed on the classical SA results on QUBO instances obtained from the following parameters: $\sigma^2 = 1.0~\text{\AA}^2$ and, from smaller to larger instances, $\tau_g = 0.1, 0.05, 0.002$ and $\delta = 0.5, 0.5, 0.35~\text{\AA}$. For each panel, we report: on the left-side plot, $P^*$ (closest water placement precision), $\langle P \rangle$ (cluster-averaged precision) and $C$ (fraction of crystal waters identified), and on the right-side plot, $\langle CS \rangle$ (average cluster size); error bars show the 95% confidence interval for $\langle P \rangle$ and $\langle CS \rangle$. Each metric is computed as described in the Methods section, extracting the PWs corresponding to the best solution obtained from the corresponding optimization
  • ...and 4 more figures
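
Two quantities recur in the captions above: the probability of sampling the optimal solution (Figures 2 and 4) and the optimality gap of an exact solver (Figure 3). The sketch below shows one plausible way to compute them from a list of sampled costs. The function names, the tolerance, and the sample values are hypothetical, and the relative-gap formula is one common mixed-integer-programming convention, not necessarily the definition used in the paper.

```python
def success_probability(sample_costs, optimal_cost, tol=1e-9):
    """Fraction of samples whose cost matches the known optimum within tol."""
    if not sample_costs:
        raise ValueError("sample_costs must not be empty")
    hits = sum(1 for cost in sample_costs if abs(cost - optimal_cost) <= tol)
    return hits / len(sample_costs)

def relative_gap(incumbent, best_bound, eps=1e-10):
    """Relative optimality gap |bound - incumbent| / (eps + |incumbent|).

    Assumed convention: incumbent is the best feasible cost found so far,
    best_bound is the solver's proven bound on the optimum.
    """
    return abs(best_bound - incumbent) / (eps + abs(incumbent))

# Hypothetical sampled costs from some solver, with a known optimum of -1.9.
sample_costs = [-1.9, -1.5, -1.9, 3.2]
p_opt = success_probability(sample_costs, optimal_cost=-1.9)
gap = relative_gap(incumbent=-1.9, best_bound=-2.0)
print(p_opt, gap)
```

A gap of zero certifies that the incumbent is optimal. A gap that stays positive over time, as tracked in the Figure 3 inset, means the exact solver has not yet proven optimality.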