Fast tensor-based electrostatic energy calculations in the perspective of protein-ligand docking problem
Peter Benner, Boris N. Khoromskij, Venera Khoromskaia, Matthias Stein
TL;DR
This work addresses the bottleneck of computing electrostatic energies in protein-ligand docking for large biomolecular systems. It introduces a range-separated (RS) tensor framework that represents the free-space electrostatic potential on large $n\times n\times n$ 3D grids and evaluates the energy with $O(n)$ complexity in the univariate grid size $n$, with the number of particles $N$ exerting only a logarithmic influence. By decomposing the potential into long-range and short-range components and precomputing a low-rank canonical (CP) representation of the long-range part, the method reduces energy and gradient evaluations for arbitrary ligand positions to $O(R N_L)$ per configuration, where $R$ is the CP rank and $N_L$ is the number of ligand particles, independent of protein size. Numerical experiments on synthetic data and moderate-size biomolecules demonstrate accurate energy assessments and the feasibility of blind docking searches using only electrostatic contributions, suggesting that this approach can augment existing stochastic or deterministic docking schemes with substantial speedups.
Abstract
We propose and justify a new approach for the fast calculation of the electrostatic interaction energy of clusters of charged particles in constrained energy minimization within the framework of rigid protein-ligand docking. Our ``blind search'' docking technique is based on the low-rank range-separated (RS) tensor representation of the free-space electrostatic potential of the biomolecule on a large $n\times n\times n$ 3D grid. We show that both the collective electrostatic potential of a complex protein-ligand system and the respective electrostatic interaction energy can be calculated by tensor techniques with $O(n)$ complexity, such that the numerical cost of the energy calculation depends only mildly (logarithmically) on the number of particles in the system. Moreover, the tensor representation of the electrostatic potential enables the use of large 3D Cartesian grids (of the order of $n^3 \sim 10^{12}$), which could allow the accurate modeling of complexes with several large proteins. In our approach, the selection of correct geometric pose predictions in the localized posing process is based on controlling the van der Waals distance between the target molecular clusters. Here, we confine ourselves to constrained minimization of the energy functional using only the fast tensor-based recalculation of the free-space electrostatic energy for various rotations and translations of both clusters. Numerical tests of the electrostatic energy-based ``protein-ligand docking'' algorithm applied to synthetic and realistic input data present a proof of concept for rather complex particle configurations. The method may be used within traditional stochastic or deterministic posing/docking techniques.
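
To illustrate why a low-rank representation makes repeated energy evaluation cheap, the following minimal sketch evaluates $\sum_m q_m P(i_m, j_m, l_m)$ for a potential $P$ stored in canonical (CP) format, without ever assembling the $n^3$ array. It is not the authors' RS tensor implementation: the factor matrices `U`, `V`, `W` are random placeholders for a precomputed long-range part, the ligand charges are assumed to sit exactly on grid points, and all names (`cp_energy`, `idx`, `q`) are hypothetical. Physical prefactors and the short-range contribution are omitted.

```python
import numpy as np

def cp_energy(U, V, W, idx, q):
    """Return sum_m q_m * P[i_m, j_m, l_m] for a CP-format tensor P.

    P[i, j, l] = sum_k U[i, k] * V[j, k] * W[l, k]  (rank R, never formed).
    U, V, W : (n, R) factor matrices (placeholders for a long-range potential)
    idx     : (N_L, 3) integer grid indices of the ligand charges (assumed on-grid)
    q       : (N_L,) ligand charges
    """
    i, j, l = idx[:, 0], idx[:, 1], idx[:, 2]
    # Pointwise values of P at the N_L charge positions: O(R * N_L) work.
    values = np.einsum("mk,mk,mk->m", U[i], V[j], W[l])
    return float(q @ values)

# Synthetic data only; sizes are chosen so that the dense check below is feasible.
rng = np.random.default_rng(0)
n, R, N_L = 32, 5, 10
U, V, W = (rng.standard_normal((n, R)) for _ in range(3))
idx = rng.integers(0, n, size=(N_L, 3))
q = rng.standard_normal(N_L)

energy = cp_energy(U, V, W, idx, q)

# Reference value from the full n x n x n tensor (possible only for small n).
P_full = np.einsum("ik,jk,lk->ijl", U, V, W)
energy_ref = float(q @ P_full[idx[:, 0], idx[:, 1], idx[:, 2]])
```

In this toy setting, moving the ligand only changes `idx`, so each new configuration costs $O(R N_L)$ operations regardless of how many particles contributed to the factors. Off-grid charge positions would need an interpolation step, which this sketch does not attempt.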
