Hybrid Quantum-Classical Encoding for Accurate Residue-Level pKa Prediction

Van Le, Tan Le

TL;DR

A reproducible hybrid quantum-classical framework that enriches residue-level representations with a Gaussian kernel-based quantum-inspired feature mapping and combines them with normalized structural features to form a unified hybrid encoding processed by a Deep Quantum Neural Network (DQNN).

Abstract

Accurate prediction of residue-level pKa values is essential for understanding protein function, stability, and reactivity. While existing resources such as DeepKaDB and CpHMD-derived datasets provide valuable training data, their descriptors remain primarily classical and often struggle to generalize across diverse biochemical environments. We introduce a reproducible hybrid quantum-classical framework that enriches residue-level representations with a Gaussian kernel-based quantum-inspired feature mapping. These quantum-enhanced descriptors are combined with normalized structural features to form a unified hybrid encoding processed by a Deep Quantum Neural Network (DQNN). This architecture captures nonlinear relationships in residue microenvironments that are not accessible to classical models. Benchmarking across multiple curated descriptor sets demonstrates that the DQNN achieves improved cross-context generalization relative to classical baselines. External evaluation on the PKAD-R experimental benchmark and an A$\beta$40 case study further highlights the robustness and transferability of the quantum-inspired representation. By integrating quantum-inspired feature transformations with classical biochemical descriptors, this work establishes a scalable and experimentally transferable approach for residue-level pKa prediction and broader applications in protein electrostatics.
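
The hybrid encoding described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the kernel parameters, the reference points, or the descriptor set, so the names `zscore`, `gaussian_kernel_features`, `hybrid_encoding`, the bandwidth `gamma`, and the choice of reference `centers` are all assumptions made for illustration. The sketch standardizes a matrix of per-residue descriptors, maps each residue to Gaussian kernel similarities against a few reference points, and concatenates both blocks into one feature vector that a downstream network (the DQNN in the paper) could consume.

```python
import numpy as np

def zscore(X, eps=1e-8):
    """Column-wise standardization of per-residue descriptors (assumed normalization)."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    return (X - mu) / (sigma + eps)

def gaussian_kernel_features(X, centers, gamma=0.5):
    """Gaussian kernel similarities: out[i, j] = exp(-gamma * ||X[i] - centers[j]||^2).

    `centers` and `gamma` are hypothetical; the abstract does not state them.
    """
    diff = X[:, None, :] - centers[None, :, :]   # shape (n, m, d)
    sq_dist = np.sum(diff ** 2, axis=-1)         # shape (n, m)
    return np.exp(-gamma * sq_dist)

def hybrid_encoding(X, centers, gamma=0.5):
    """Concatenate normalized descriptors with their kernel-mapped counterparts."""
    Xn = zscore(X)
    K = gaussian_kernel_features(Xn, centers, gamma)
    return np.concatenate([Xn, K], axis=1)

# Toy data: 6 residues with 4 made-up descriptors each (not from any dataset).
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))

# Assumption: use the first 3 normalized residues as kernel reference points.
centers = zscore(X)[:3]

H = hybrid_encoding(X, centers)
print(H.shape)  # 4 normalized columns + 3 kernel columns per residue
```

Each residue here ends up with its normalized descriptors followed by one kernel similarity per reference point. In practice the reference points and bandwidth would be selected on training data only, so that no information leaks from evaluation sets such as PKAD-R.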

Paper Structure (23 sections, 12 equations, 2 figures, 2 tables)


Figures (2)

  • Figure 1: Schematic overview contrasting dataset origins and methodological innovations in residue‑level pKa prediction. (Left) DeepKaDB: descriptor‑driven resource derived from soluble proteins in PDBbind, providing curated feature sets for classical machine learning models. (Center) PHMD549: simulation‑driven dataset generated via GPU‑accelerated CpHMD, expanding PHMD279 to 26,552 residues across 549 proteins. (Right) DQNN framework: hybrid quantum–classical pipeline that integrates curated descriptors with quantum‑inspired feature transformations.
  • Figure 2: Comparison of A$\beta$40 histidine pKa predictions using experimental measurements, DeepKa, and the proposed DQNN model. Error bars indicate reported standard deviations for DeepKa and replicate variability for DQNN.