Quantum Computing-Enhanced Algorithm Unveils Novel Inhibitors for KRAS

Mohammad Ghazi Vakili; Christoph Gorgulla; AkshatKumar Nigam; Dmitry Bezrukov; Daniel Varoli; Alex Aliper; Daniil Polykovsky; Krishna M. Padmanabha Das; Jamie Snider; Anna Lyakisheva; Ardalan Hosseini Mansob; Zhong Yao; Lela Bitar; Eugene Radchenko; Xiao Ding; Jinxin Liu; Fanye Meng; Feng Ren; Yudong Cao; Igor Stagljar; Alán Aspuru-Guzik; Alex Zhavoronkov

TL;DR

This work proposes a hybrid quantum-classical framework that uses a Quantum Circuit Born Machine (QCBM) prior integrated with a classical Long Short-Term Memory (LSTM) generator to design KRAS inhibitors, guided by Chemistry42 rewards. Computational benchmarks (Tartarus) suggest the quantum-enhanced approach improves distribution matching and sample quality, while a KRAS inhibitor design campaign yields two experimentally validated ligands, ISM061-018-2 and ISM061-22, with distinct mutation-specific profiles. The results demonstrate practical applicability of near-term quantum hardware to drug discovery and reveal a roughly linear relationship between qubit count and design performance, supporting scalable quantum-classical exploration of chemical space. Although a true quantum advantage is not demonstrated, the study provides a solid stepping stone for quantum-assisted workflows in medicinal chemistry and highlights the potential for larger, more capable quantum resources to further amplify design efficacy.

Abstract

The discovery of small molecules with therapeutic potential is a long-standing challenge in chemistry and biology. Researchers have increasingly leveraged novel computational techniques to streamline drug development, increase hit rates, and reduce the costs associated with bringing a drug to market. To this end, we introduce a quantum-classical generative model for designing small molecules that integrates quantum algorithms trained on a 16-qubit IBM quantum computer with the established reliability of classical methods. Our hybrid generative model was applied to designing new inhibitors of KRAS, a crucial target in cancer therapy. We synthesized 15 promising molecules during our investigation and subjected them to experimental testing to assess their ability to engage with the target. Among these candidates, two molecules, ISM061-018-2 and ISM061-22, each featuring a unique scaffold, stood out by demonstrating effective engagement with KRAS. ISM061-018-2 was identified as a broad-spectrum KRAS inhibitor, exhibiting a binding affinity to KRAS-G12D of 1.4 μM. ISM061-22, by contrast, exhibited specific mutant selectivity, displaying heightened activity against the KRAS G12R and Q61H mutants. To our knowledge, this work shows for the first time the use of a quantum-generative model to yield experimentally confirmed biological hits, showcasing the practical potential of quantum-assisted drug discovery to produce viable therapeutics. Moreover, our findings reveal that the efficacy of distribution learning correlates with the number of qubits utilized, underlining the scalability potential of quantum computing resources. Overall, we anticipate our results to be a stepping stone towards developing more advanced quantum generative models in drug discovery.

Paper Structure (22 sections, 4 equations, 7 figures, 2 tables)

This paper contains 22 sections, 4 equations, 7 figures, 2 tables.

Introduction
Results and Discussion
Computational Benchmarks - Classical vs Quantum Models
Tartarus Benchmark
Benchmarking of Prior distributions
KRAS Inhibitor Design Campaign
Chemistry42 Post-Screening and Selection of Promising Candidate Structures for Synthesis
Experimental Evaluation of Generated Compounds
Conclusion
Acknowledgements
Methods
Data Acquisition and Pre-processing
STONED-SELFIES
Virtual Screening Process
Quantum Assisted Algorithm
...and 7 more sections

Figures (7)

Figure 1: Schematic Representation of the Hybrid Quantum-Classical Framework for KRAS Ligand Development. The initial phase concentrates on compiling a dataset for model training. A curated set of 650 experimentally verified inhibitors targeting the KRAS protein is extracted from the literature. By applying the STONED-SELFIES algorithm, analogs for each identified compound are derived, yielding an expanded collection of around 850,000 compounds. This dataset is further enhanced by the addition of the top 250,000 candidates, identified via a virtual screening process using the REAL ligand library against the KRAS protein, culminating in a dataset of over 1 million molecules for training our generative model. Upon completing the training of our model, new molecules targeting KRAS are created employing both a classical LSTM network and a Quantum Circuit Born Machine (QCBM) as the underlying generative frameworks. The LSTM network processes sequential data encapsulating the chemical structures of ligands, while QCBM, trained based on the quality of LSTM-generated samples, creates complex, high-dimensional probability distributions. The combined workflow utilizes Chemistry42 as a reward function to incentivize the creation of structurally diverse and synthesizable molecules.
Figure 2: Quantum-Enhanced Generative Model for Drug Discovery Applications. (A) Hybrid model combining a Quantum Circuit Born Machine (QCBM) with a Long Short-Term Memory (LSTM) network. The model is trained iteratively using prior samples from quantum hardware. (B) Integration of prior samples into the LSTM architecture. Molecular information (in SELFIES encoding) and quantum data are merged by addition or concatenation. The resulting samples, $X'(t)$, are then input to the LSTM cell. (C) Quantum prior component, a QCBM that generates samples from quantum hardware in each training epoch and is trained with a reward value, $P(x) = \text{Softmax}(R(x))$, calculated using Chemistry42 or a local filter. (D) Process of experimental sample selection. One million compounds are sampled from each model: classical samples (vanilla LSTM), quantum samples (QCBM on quantum hardware), and simulated samples (quantum simulation on classical hardware). These samples are evaluated by Chemistry42, which filters out compounds unsuitable for pharmacological purposes and ranks the remaining compounds by their docking score (PLI score). Subsequently, 15 novel compounds were selected for synthesis.
Figure 3: Comparative Benchmarking of Quantum and Classical Ligand Design Methods. (A) Evaluation of the proposed model against classical counterparts using the Tartarus benchmark suite (Nigam et al., 2023) for ligand design across three protein targets (1SYH, 6Y2F, and 4LDE), with models trained on a subset of the DTP Open Compound Collection. Displayed metrics show the mean ± standard deviation of the optimal objective values for the targets, compiled from five individual experiments. 'Dataset' refers to the molecule with the highest performance in the training dataset, whereas 'Native Docking' indicates the native ligands within their crystallographic structures. The notation $\Delta E_{X}$ signifies the docking score relative to the protein target designated by $X$. SR stands for the success ratio, indicating the percentage of molecules that meet the predefined structural benchmarks. (B) Comparative analysis of our hybrid approaches with varied priors. The performance of the Quantum Circuit Born Machine (QCBM) was assessed using both a quantum simulator (Sim) and a hardware backend (HW), and contrasted with a Multi-bases QCBM (MQCBM) operating solely on a quantum simulator (Sim), as well as an LSTM model devoid of quantum priors (representing a fully classical architecture). We calculated the number of generated molecules that met a series of synthesizability and stability criteria as stipulated by the Tartarus benchmarking platform (referred to as Local Filters) and by Chemistry42 (referred to as Chemistry42 Filters). (C) Comparative analysis of prior sampling techniques in producing high-docking molecules, as assessed by QuickVina2 and Chemistry42. This comparison reports, across the various methods, the Success Rate (SR %) of molecules meeting Tartarus filter criteria, the uniqueness of generated ligands (Unique Fraction, UF %), and the Structural Diversity Fraction (DF %) of the generated ligands. It also reports the success rates (SR% Ch42) of molecules meeting Chemistry42's filter criteria, the top reward values (R Ch42) assigned to molecules by Chemistry42, the synthetic accessibility score (SA Ch42) of drug-like molecules, and the highest PLI Ch42 scores found in the generation. The PLI score is measured in kcal/mol, with more negative values indicating better scores. (D) Success rate of generating molecules that meet Tartarus's filter criteria as a function of the number of qubits used in modeling priors for the QCBM.
Figure 4: Pharmacological Characterization of Compound ISM061-018-2 Through Surface Plasmon Resonance and Cellular Activity Assays. (A) Chemical structure of ISM061-018-2. (B) Surface Plasmon Resonance (SPR) sensorgram illustrating the binding kinetics of ISM061-018-2 with various KRAS proteins. (C) Results of Cell-Titer-Glo viability assays illustrating the impact of the compound on cellular proliferation across a concentration range from 123 nM to 30 µM. Reported values represent the mean of three technical replicates, with standard deviation (S.D.) indicated. (D, Table) A compendium of IC50 values derived from MaMTH-DS dose-response assays, conducted in biological triplicate, evaluating the interactions of a range of RAS protein baits with the RAF1 prey partner. Investigated RAS members include the wild-type forms of KRAS, HRAS, and NRAS, alongside five oncogenic KRAS mutants of clinical significance. The interaction between EGFR and the SHC1 adapter was additionally examined as an off-target control. We provide 95% confidence intervals and R-squared values to verify the accuracy of the curve fitting. (D,E) Dose-response curves from MaMTH-DS assays graphing the modulation of activity of various KRAS proteins, NRAS, HRAS, and EGFR, in response to increasing concentrations of ISM061-018-2 (from 4 nM to 30 µM), plotted on a logarithmic scale. The curves displayed represent one set from three biological replicates. Each point denotes the mean of three to four technical replicates, with S.D. provided. Curve fitting was executed in GraphPad Prism as delineated in the Methods section. These profiles underscore the compound's differential potency against distinct targets, shedding light on its pharmacological spectrum.
Figure 5: Pharmacological Evaluation of Compound ISM061-22 Against KRAS Variants and Other Related Proteins. (A) Chemical structure of ISM061-22. (B) Results from Cell-Titer-Glo viability assays depicting the effect of the compound on cellular proliferation over a concentration range from 123 nM to 30 µM. The values represent the mean of three technical replicates, with standard deviation (S.D.) indicated. (C, Table) Summary of IC50 values derived from MaMTH-DS dose-response assays, conducted in biological triplicate, evaluating the interactions of various RAS protein baits with the RAF1 prey partner. Tested RAS members include wild-type KRAS, HRAS, and NRAS, as well as five clinically significant oncogenic KRAS mutants. The interaction between EGFR and the SHC1 adapter was also examined as an off-target control. The 95% confidence intervals and R-squared values are reported to confirm the precision of the curve fitting. (C,D) Dose-response curves from MaMTH-DS assays, illustrating the modulation of activity of different KRAS proteins, NRAS, HRAS, and EGFR, by increasing concentrations of ISM061-22 (from 4 nM to 30 µM), presented on a logarithmic scale. The curves represent one instance from three biological replicates. Each data point is the average of three to four technical replicates, with S.D. presented. Curve fitting procedures were executed in GraphPad Prism, as outlined in the Methods section. The data emphasize the compound's differing effectiveness against various protein targets, illuminating its potential therapeutic value.
...and 2 more figures
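
The Figure 2 caption mentions two small operations that are easy to illustrate: merging a quantum prior sample with the molecular encoding by addition or concatenation to form $X'(t)$ (panel B), and converting reward values into a target distribution via $P(x) = \text{Softmax}(R(x))$ (panel C). The following is a minimal sketch in plain Python, not the authors' implementation. The function names `softmax` and `merge_prior`, the vector sizes, and all numeric values are hypothetical and chosen only for illustration; the real rewards would come from Chemistry42 or a local filter, and the real prior samples from the QCBM.

```python
import math


def softmax(rewards):
    """Map raw reward values R(x) to a probability distribution P(x)."""
    if not rewards:
        raise ValueError("rewards must be non-empty")
    # Subtract the maximum before exponentiating for numerical stability.
    shift = max(rewards)
    exps = [math.exp(r - shift) for r in rewards]
    total = sum(exps)
    return [e / total for e in exps]


def merge_prior(embedding, prior_bits, mode="concat"):
    """Combine a token embedding with a prior bitstring to form X'(t).

    Hypothetical helper: 'concat' appends the prior to the embedding,
    'add' sums them element-wise (both must then have equal length).
    """
    prior = [float(b) for b in prior_bits]
    if mode == "concat":
        return list(embedding) + prior
    if mode == "add":
        if len(embedding) != len(prior):
            raise ValueError("addition requires equal-length vectors")
        return [e + p for e, p in zip(embedding, prior)]
    raise ValueError(f"unknown mode: {mode}")


# Illustrative values only (assumed, not taken from the paper).
rewards = [0.2, 1.5, 0.9]          # one reward per sampled bitstring
target = softmax(rewards)          # target distribution for the prior

embedding = [0.1, -0.3, 0.7]       # stand-in for one SELFIES token embedding
prior_sample = [1, 0, 1]           # stand-in for one measured bitstring

x_concat = merge_prior(embedding, prior_sample, mode="concat")
x_add = merge_prior(embedding, prior_sample, mode="add")
```

Concatenation keeps the molecular and prior information in separate coordinates at the cost of a wider LSTM input, while addition preserves the input width but requires the two vectors to share a dimension.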
